netdev.vger.kernel.org archive mirror
* Re: [Xen-devel] Xen-unstable Linux 3.14-rc3 and 3.13 Network troubles "bisected"
       [not found]                     ` <79975567.20140325162942@eikelenboom.it>
@ 2014-03-26 11:11                       ` Sander Eikelenboom
  2014-03-26 14:44                         ` Paul Durrant
  0 siblings, 1 reply; 20+ messages in thread
From: Sander Eikelenboom @ 2014-03-26 11:11 UTC (permalink / raw)
  To: Paul Durrant
  Cc: Wei Liu, annie li, Zoltan Kiss, xen-devel, Ian Campbell,
	linux-kernel, netdev

Paul,

You have been awfully silent for this whole thread, while this is a regression caused by a patch of yours
(ca2f09f2b2c6c25047cfc545d057c4edfcfe561c, as clearly stated much earlier in this thread).

The commit message states:
    "net_rx_action() is the place where we could do with an accurate predicition but,
    since that has proven tricky to calculate, a cheap worse-case (but not too bad)
    estimate is all we really need since the only thing we *must* prevent is xenvif_gop_skb()
    consuming more slots than are available."

Your "worst-case" calculation stated in the commit message is clearly not the worst case,
since it doesn't take calls to "get_next_rx_buffer" into account.
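
To make the mismatch concrete, here is a minimal sketch (the helper name estimate_frag_slots is made up for illustration; the loop body paraphrases the xenvif_rx_action() code quoted later in this thread, it is not the literal driver source):

        /* Sketch of the per-frag estimate made in xenvif_rx_action():
         * one slot per PAGE_SIZE worth of frag data.
         */
        static unsigned int estimate_frag_slots(const struct sk_buff *skb)
        {
                unsigned int i, max_slots_needed = 0;

                for (i = 0; i < skb_shinfo(skb)->nr_frags; i++) {
                        unsigned int size;

                        size = skb_frag_size(&skb_shinfo(skb)->frags[i]);
                        max_slots_needed += DIV_ROUND_UP(size, PAGE_SIZE);
                }
                return max_slots_needed;
        }

        /* While actually copying, however, every time start_new_rx_buffer()
         * decides the current slot cannot take the next chunk,
         * get_next_rx_buffer() fetches a further request (upping req_cons);
         * that extra slot is not reflected in the estimate above.
         */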

The problem is that a true worst-case calculation would probably amount to reverting to the old calculation,
and the problems this patch was trying to solve would reappear, but introducing new regressions
isn't very useful either. And since it seems such a tricky and fragile thing to determine, the calculation would
probably best be split into a distinct function with a comment explaining the rationale used.

Since this doesn't seem to progress very fast .. CC'ed some more folks .. you never know ..

--
Sander


Tuesday, March 25, 2014, 4:29:42 PM, you wrote:


> Tuesday, March 25, 2014, 4:15:39 PM, you wrote:

>> On Sat, Mar 22, 2014 at 07:28:34PM +0100, Sander Eikelenboom wrote:
>> [...]
>>> > Yes there is only one frag .. but it seems to be much larger than PAGE_SIZE .. and xenvif_gop_frag_copy brakes that frag down into smaller bits .. hence the calculation in xenvif_rx_action determining the slots needed by doing:
>>> 
>>> >                 for (i = 0; i < nr_frags; i++) {
>>> >                         unsigned int size;
>>> >                         size = skb_frag_size(&skb_shinfo(skb)->frags[i]);
>>> >                         max_slots_needed += DIV_ROUND_UP(size, PAGE_SIZE);
>>> >                 }
>>> 
>>> > But the code in xenvif_gop_frag_copy .. seems to be needing one more slot (from the emperical test) .. and calling "get_next_rx_buffer" seems involved in that ..
>>> 
>>> Hmm looked again .. and it seems this is it .. when your frags are large enough you have the chance of running into this.
>>> 

>> get_next_rx_buffer is guarded by start_new_rx_buffer. Do you see any
>> problem with that implementation?
> In general no, but "get_next_rx_buffer" up's cons .. and the calculations done in "xenvif_rx_action" for max_slots_needed to prevent the overrun
> don't count in this possibility. So it's not the guarding of "start_new_rx_buffer" that is at fault. It's the ones early in "xenvif_rx_action".
> The ones that were changed by Paul's patch from MAX_SKB_FRAGS to a calculated value that should be a "slim fit".

> The problem is in determining upfront in "xenvif_rx_action" when and how often the "get_next_rx_buffer" path will be taken.
> Unless there are other non direct restrictions (from a size point of view) it can be called multiple times per frag per skb.

>>> Problem is .. i don't see an easy fix, the "one more slot" of the empirical test doesn't seem to be the worst case either (i think):
>>> 
>>> - In my case the packets that hit this only have 1 frag, but i could have had more frags.
>>>   I also think you can't rule out the possibility of doing the "get_next_rx_buffer" for multiple subsequent frags from one packet,
>>>   so in the worst (and perhaps even from a single frag since it's looped over a split of it in what seems PAGE_SIZE pieces.)
>>>   - So an exact calculation of how much slots we are going to need for hitting this "get_next_rx_buffer"  upfront in "xenvif_rx_action" seems unfeasible.
>>>   - A worst case gamble seems impossible either .. if you take multiple frags * multiple times the "get_next_rx_buffer" ... you would probably be back at just
>>>     setting the needed_slots to MAX_SKB_FRAGS.
>>> 
>>> - Other thing would be checking for the available slots before doing the  "get_next_rx_buffer" .. how ever .. we don't account for how many slots we still need to
>>>   just process the remaining frags.
>>> 

>> We've done a worst case estimation for whole SKB (linear area + all
>> frags) already, at the very first beginning. That's what
>> max_slots_needed is for.

>>> - Just remove the whole "get_next_rx_buffer".. just tested it but it seems the "get_next_rx_buffer" is necessary ..  when i make start_new_rx_buffer always return false
>>>   i hit the bug below :S
>>> 
>>> So the questions are ... is the "get_next_rx_buffer" part really necessary ?
>>>    - if not, what is the benefit of the "get_next_rx_buffer" and does that outweigh the additional cost of accounting
>>>      of needed_slots for the frags that have yet to be processed ?
>>>    - if yes, erhmmm what would be the best acceptable solution .. returning to using MAX_SKB_FRAGS ?
>>> 

>> I think you need to answer several questions first:
>> 1. is max_slots_needed actually large enough to cover whole SKB?
>         No it's not if, you end up calling "get_next_rx_buffer" one or multiple times when processing the SKB
>         you have the chance of overrunning (or be saved because prod gets upped before you overrun).

>> 2. is the test of ring slot availability acurate?
>         Seems to be.

>> 3. is the number of ring slots consumed larger than max_slots_needed? (I
>>    guess the answer is yes)
>         Yes that was the whole point.

>> 4. which step in the break-down procedure causes backend to overrun the
>>    ring?
>         The "get_next_rx_buffer" call .. that is not accounted for (which can be called
>         multiple times per frag (unless some other none obvious size restriction limits this
>         to one time per frag or one time per SKB).
>         In my errorneous case it is called one time, but it would be nice if there would be some theoretical proof
>         instead of just some emperical test.


>> It doesn't matter how many frags your SKB has and how big a frag is. If
>> every step is correct then you're fine. The code you're looking at
>> (start_new_rx_buffer / get_next_rx_buffer and friend) needs to be
>> carefully examined.

> Well it seems it only calls "get_next_rx_buffer" in some special cases .. (and that also what i'm seeing because it doesn't happen
> continously.

> Question is shouldn't this max_needed_slots calc be reverted to what it was before 3.14 and take the time in 3.15 to figure out a
> the theoretical max (if it can be calculated upfront) .. or another way to guarantee the code is able to process the whole SKB  ?

>> Wei.


* RE: [Xen-devel] Xen-unstable Linux 3.14-rc3 and 3.13 Network troubles "bisected"
  2014-03-26 11:11                       ` [Xen-devel] Xen-unstable Linux 3.14-rc3 and 3.13 Network troubles "bisected" Sander Eikelenboom
@ 2014-03-26 14:44                         ` Paul Durrant
  2014-03-26 15:22                           ` Sander Eikelenboom
  0 siblings, 1 reply; 20+ messages in thread
From: Paul Durrant @ 2014-03-26 14:44 UTC (permalink / raw)
  To: Sander Eikelenboom
  Cc: Wei Liu, annie li, Zoltan Kiss, xen-devel@lists.xen.org,
	Ian Campbell, linux-kernel, netdev@vger.kernel.org

> -----Original Message-----
> From: Sander Eikelenboom [mailto:linux@eikelenboom.it]
> Sent: 26 March 2014 11:11
> To: Paul Durrant
> Cc: Wei Liu; annie li; Zoltan Kiss; xen-devel@lists.xen.org; Ian Campbell; linux-
> kernel; netdev@vger.kernel.org
> Subject: Re: [Xen-devel] Xen-unstable Linux 3.14-rc3 and 3.13 Network
> troubles "bisected"
> 
> Paul,
> 
> You have been awfully silent for this whole thread while this is a regression
> caused by a patch of you
> (ca2f09f2b2c6c25047cfc545d057c4edfcfe561c as clearly stated much earlier in
> this thread).
> 

Sorry, I've been distracted...

> The commit messages states:
>     "net_rx_action() is the place where we could do with an accurate
> predicition but,
>     since that has proven tricky to calculate, a cheap worse-case (but not too
> bad)
>     estimate is all we really need since the only thing we *must* prevent is
> xenvif_gop_skb()
>     consuming more slots than are available."
> 
> Your "worst-case" calculation stated in the commit message is clearly not the
> worst case,
> since it doesn't take calls to "get_next_rx_buffer" into account.
> 

It should be taking into account the behaviour of start_new_rx_buffer(), which should be true if a slot is full or a frag will overflow the current slot and doesn't require splitting. The code in net_rx_action() makes the assumption that each frag will require as many slots as its size requires, i.e. it assumes no packing of multiple frags into a single slot, so it should be a worst case. Did I miss something in that logic?
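
For reference, the decision being described is made by start_new_rx_buffer(); roughly, as a sketch of the 3.14-era logic (paraphrased, not the verbatim driver source):

        /* Start a new ring slot when the current one is full, or when the
         * next chunk would overflow it and can instead go whole into a
         * fresh slot (i.e. no splitting is needed).
         */
        static bool start_new_rx_buffer(int offset, unsigned long size, int head)
        {
                /* Simple case: the current buffer is completely filled. */
                if (offset == MAX_BUFFER_OFFSET)
                        return true;

                /* The chunk would overflow this buffer, fits whole in the
                 * next one, the buffer already holds some data, and this is
                 * not the head buffer.
                 */
                if ((offset + size > MAX_BUFFER_OFFSET) &&
                    (size <= MAX_BUFFER_OFFSET) && offset && !head)
                        return true;

                return false;
        }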

  Paul

> Problem is that a worst case calculation would probably be reverting to the
> old calculation,
> and the problems this patch was trying to solve would reappear, but
> introducing new regressions
> isn't very useful either. And since it seems such a tricky and fragile thing to
> determine, it would
> probably be best to be split into a distinct function with a comment to explain
> the rationale used.
> 
> Since this doesn't seem to progress very fast .. CC'ed some more folks .. you
> never know ..
> 
> --
> Sander
> 
> 
> Tuesday, March 25, 2014, 4:29:42 PM, you wrote:
> 
> 
> > Tuesday, March 25, 2014, 4:15:39 PM, you wrote:
> 
> >> On Sat, Mar 22, 2014 at 07:28:34PM +0100, Sander Eikelenboom wrote:
> >> [...]
> >>> > Yes there is only one frag .. but it seems to be much larger than
> PAGE_SIZE .. and xenvif_gop_frag_copy brakes that frag down into smaller
> bits .. hence the calculation in xenvif_rx_action determining the slots needed
> by doing:
> >>>
> >>> >                 for (i = 0; i < nr_frags; i++) {
> >>> >                         unsigned int size;
> >>> >                         size = skb_frag_size(&skb_shinfo(skb)->frags[i]);
> >>> >                         max_slots_needed += DIV_ROUND_UP(size, PAGE_SIZE);
> >>> >                 }
> >>>
> >>> > But the code in xenvif_gop_frag_copy .. seems to be needing one
> more slot (from the emperical test) .. and calling "get_next_rx_buffer"
> seems involved in that ..
> >>>
> >>> Hmm looked again .. and it seems this is it .. when your frags are large
> enough you have the chance of running into this.
> >>>
> 
> >> get_next_rx_buffer is guarded by start_new_rx_buffer. Do you see any
> >> problem with that implementation?
> > In general no, but "get_next_rx_buffer" up's cons .. and the calculations
> done in "xenvif_rx_action" for max_slots_needed to prevent the overrun
> > don't count in this possibility. So it's not the guarding of
> "start_new_rx_buffer" that is at fault. It's the ones early in
> "xenvif_rx_action".
> > The ones that were changed by Paul's patch from MAX_SKB_FRAGS to a
> calculated value that should be a "slim fit".
> 
> > The problem is in determining upfront in "xenvif_rx_action" when and how
> often the "get_next_rx_buffer" path will be taken.
> > Unless there are other non direct restrictions (from a size point of view) it
> can be called multiple times per frag per skb.
> 
> >>> Problem is .. i don't see an easy fix, the "one more slot" of the empirical
> test doesn't seem to be the worst case either (i think):
> >>>
> >>> - In my case the packets that hit this only have 1 frag, but i could have
> had more frags.
> >>>   I also think you can't rule out the possibility of doing the
> "get_next_rx_buffer" for multiple subsequent frags from one packet,
> >>>   so in the worst (and perhaps even from a single frag since it's looped
> over a split of it in what seems PAGE_SIZE pieces.)
> >>>   - So an exact calculation of how much slots we are going to need for
> hitting this "get_next_rx_buffer"  upfront in "xenvif_rx_action" seems
> unfeasible.
> >>>   - A worst case gamble seems impossible either .. if you take multiple
> frags * multiple times the "get_next_rx_buffer" ... you would probably be
> back at just
> >>>     setting the needed_slots to MAX_SKB_FRAGS.
> >>>
> >>> - Other thing would be checking for the available slots before doing the
> "get_next_rx_buffer" .. how ever .. we don't account for how many slots we
> still need to
> >>>   just process the remaining frags.
> >>>
> 
> >> We've done a worst case estimation for whole SKB (linear area + all
> >> frags) already, at the very first beginning. That's what
> >> max_slots_needed is for.
> 
> >>> - Just remove the whole "get_next_rx_buffer".. just tested it but it
> seems the "get_next_rx_buffer" is necessary ..  when i make
> start_new_rx_buffer always return false
> >>>   i hit the bug below :S
> >>>
> >>> So the questions are ... is the "get_next_rx_buffer" part really necessary
> ?
> >>>    - if not, what is the benefit of the "get_next_rx_buffer" and does that
> outweigh the additional cost of accounting
> >>>      of needed_slots for the frags that have yet to be processed ?
> >>>    - if yes, erhmmm what would be the best acceptable solution ..
> returning to using MAX_SKB_FRAGS ?
> >>>
> 
> >> I think you need to answer several questions first:
> >> 1. is max_slots_needed actually large enough to cover whole SKB?
> >         No it's not if, you end up calling "get_next_rx_buffer" one or multiple
> times when processing the SKB
> >         you have the chance of overrunning (or be saved because prod gets
> upped before you overrun).
> 
> >> 2. is the test of ring slot availability acurate?
> >         Seems to be.
> 
> >> 3. is the number of ring slots consumed larger than max_slots_needed? (I
> >>    guess the answer is yes)
> >         Yes that was the whole point.
> 
> >> 4. which step in the break-down procedure causes backend to overrun
> the
> >>    ring?
> >         The "get_next_rx_buffer" call .. that is not accounted for (which can be
> called
> >         multiple times per frag (unless some other none obvious size
> restriction limits this
> >         to one time per frag or one time per SKB).
> >         In my errorneous case it is called one time, but it would be nice if there
> would be some theoretical proof
> >         instead of just some emperical test.
> 
> 
> >> It doesn't matter how many frags your SKB has and how big a frag is. If
> >> every step is correct then you're fine. The code you're looking at
> >> (start_new_rx_buffer / get_next_rx_buffer and friend) needs to be
> >> carefully examined.
> 
> > Well it seems it only calls "get_next_rx_buffer" in some special cases ..
> (and that also what i'm seeing because it doesn't happen
> > continously.
> 
> > Question is shouldn't this max_needed_slots calc be reverted to what it
> was before 3.14 and take the time in 3.15 to figure out a
> > the theoretical max (if it can be calculated upfront) .. or another way to
> guarantee the code is able to process the whole SKB  ?
> 
> >> Wei.
> 
> 
> 
> 


* Re: [Xen-devel] Xen-unstable Linux 3.14-rc3 and 3.13 Network troubles "bisected"
  2014-03-26 14:44                         ` Paul Durrant
@ 2014-03-26 15:22                           ` Sander Eikelenboom
  2014-03-26 15:50                             ` Paul Durrant
  0 siblings, 1 reply; 20+ messages in thread
From: Sander Eikelenboom @ 2014-03-26 15:22 UTC (permalink / raw)
  To: Paul Durrant
  Cc: Wei Liu, annie li, Zoltan Kiss, xen-devel@lists.xen.org,
	Ian Campbell, linux-kernel, netdev@vger.kernel.org


Wednesday, March 26, 2014, 3:44:42 PM, you wrote:

>> -----Original Message-----
>> From: Sander Eikelenboom [mailto:linux@eikelenboom.it]
>> Sent: 26 March 2014 11:11
>> To: Paul Durrant
>> Cc: Wei Liu; annie li; Zoltan Kiss; xen-devel@lists.xen.org; Ian Campbell; linux-
>> kernel; netdev@vger.kernel.org
>> Subject: Re: [Xen-devel] Xen-unstable Linux 3.14-rc3 and 3.13 Network
>> troubles "bisected"
>> 
>> Paul,
>> 
>> You have been awfully silent for this whole thread while this is a regression
>> caused by a patch of you
>> (ca2f09f2b2c6c25047cfc545d057c4edfcfe561c as clearly stated much earlier in
>> this thread).
>> 

> Sorry, I've been distracted...

>> The commit messages states:
>>     "net_rx_action() is the place where we could do with an accurate
>> predicition but,
>>     since that has proven tricky to calculate, a cheap worse-case (but not too
>> bad)
>>     estimate is all we really need since the only thing we *must* prevent is
>> xenvif_gop_skb()
>>     consuming more slots than are available."
>> 
>> Your "worst-case" calculation stated in the commit message is clearly not the
>> worst case,
>> since it doesn't take calls to "get_next_rx_buffer" into account.
>> 

> It should be taking into account the behaviour of start_new_rx_buffer(), which should be true if a slot is full or a frag will overflow the current slot and doesn't require splitting.
> The code in net_rx_action() makes the assumption that each frag will require as many slots as its size requires, i.e. it assumes no packing of multiple frags into a single slot, so it should be a worst case.
> Did I miss something in that logic?

Yes.
In "xenvif_gop_skb()" this loop:

        for (i = 0; i < nr_frags; i++) {
                xenvif_gop_frag_copy(vif, skb, npo,
                                     skb_frag_page(&skb_shinfo(skb)->frags[i]),
                                     skb_frag_size(&skb_shinfo(skb)->frags[i]),
                                     skb_shinfo(skb)->frags[i].page_offset,
                                     &head);
        }

Is capable of using up (at least) one slot more than is anticipated in "net_rx_action()" by this code:

                for (i = 0; i < skb_shinfo(skb)->nr_frags; i++) {
                        unsigned int size;
                        size = skb_frag_size(&skb_shinfo(skb)->frags[i]);
                        max_slots_needed += DIV_ROUND_UP(size, PAGE_SIZE);
                }

And this happens when it calls "get_next_rx_buffer()" from "xenvif_gop_frag_copy()" where it's breaking down the frag.
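
Schematically that path looks like the following (a heavily abridged sketch of the copy loop in xenvif_gop_frag_copy(); the grant-copy setup, offset handling and GSO details are omitted):

        while (size > 0) {
                unsigned long bytes = PAGE_SIZE - offset;

                if (bytes > size)
                        bytes = size;

                /* If the current slot is full, or this chunk should go into
                 * a fresh slot, fetch the next RX request: this ups
                 * rx.req_cons, i.e. it consumes a ring slot.
                 */
                if (start_new_rx_buffer(npo->copy_off, bytes, *head))
                        meta = get_next_rx_buffer(vif, npo);

                /* ... set up the grant copy of 'bytes' into this slot ... */

                npo->copy_off += bytes;
                offset += bytes;
                size -= bytes;
        }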

Ultimately this results in bad grant reference warnings (and packets marked as "errors" in the interface statistics).

In my case it always seems to be an skb with 1 frag which is broken down into 5 or 6 pieces ..

So "get_next_rx_buffer()" is called once .. and i'm overrunning the ring with 1 slot, but i'm not sure whether that's just coincidence,
since in the code there seems to be no explicit limit on how often this code path is taken. So perhaps it's implicitly limited,
since packets and frags can't be arbitrarily large in comparison with PAGE_SIZE, but that's not something i'm capable of figuring out :-)



>   Paul

>> Problem is that a worst case calculation would probably be reverting to the
>> old calculation,
>> and the problems this patch was trying to solve would reappear, but
>> introducing new regressions
>> isn't very useful either. And since it seems such a tricky and fragile thing to
>> determine, it would
>> probably be best to be split into a distinct function with a comment to explain
>> the rationale used.
>> 
>> Since this doesn't seem to progress very fast .. CC'ed some more folks .. you
>> never know ..
>> 
>> --
>> Sander
>> 
>> 
>> Tuesday, March 25, 2014, 4:29:42 PM, you wrote:
>> 
>> 
>> > Tuesday, March 25, 2014, 4:15:39 PM, you wrote:
>> 
>> >> On Sat, Mar 22, 2014 at 07:28:34PM +0100, Sander Eikelenboom wrote:
>> >> [...]
>> >>> > Yes there is only one frag .. but it seems to be much larger than
>> PAGE_SIZE .. and xenvif_gop_frag_copy brakes that frag down into smaller
>> bits .. hence the calculation in xenvif_rx_action determining the slots needed
>> by doing:
>> >>>
>> >>> >                 for (i = 0; i < nr_frags; i++) {
>> >>> >                         unsigned int size;
>> >>> >                         size = skb_frag_size(&skb_shinfo(skb)->frags[i]);
>> >>> >                         max_slots_needed += DIV_ROUND_UP(size, PAGE_SIZE);
>> >>> >                 }
>> >>>
>> >>> > But the code in xenvif_gop_frag_copy .. seems to be needing one
>> more slot (from the emperical test) .. and calling "get_next_rx_buffer"
>> seems involved in that ..
>> >>>
>> >>> Hmm looked again .. and it seems this is it .. when your frags are large
>> enough you have the chance of running into this.
>> >>>
>> 
>> >> get_next_rx_buffer is guarded by start_new_rx_buffer. Do you see any
>> >> problem with that implementation?
>> > In general no, but "get_next_rx_buffer" up's cons .. and the calculations
>> done in "xenvif_rx_action" for max_slots_needed to prevent the overrun
>> > don't count in this possibility. So it's not the guarding of
>> "start_new_rx_buffer" that is at fault. It's the ones early in
>> "xenvif_rx_action".
>> > The ones that were changed by Paul's patch from MAX_SKB_FRAGS to a
>> calculated value that should be a "slim fit".
>> 
>> > The problem is in determining upfront in "xenvif_rx_action" when and how
>> often the "get_next_rx_buffer" path will be taken.
>> > Unless there are other non direct restrictions (from a size point of view) it
>> can be called multiple times per frag per skb.
>> 
>> >>> Problem is .. i don't see an easy fix, the "one more slot" of the empirical
>> test doesn't seem to be the worst case either (i think):
>> >>>
>> >>> - In my case the packets that hit this only have 1 frag, but i could have
>> had more frags.
>> >>>   I also think you can't rule out the possibility of doing the
>> "get_next_rx_buffer" for multiple subsequent frags from one packet,
>> >>>   so in the worst (and perhaps even from a single frag since it's looped
>> over a split of it in what seems PAGE_SIZE pieces.)
>> >>>   - So an exact calculation of how much slots we are going to need for
>> hitting this "get_next_rx_buffer"  upfront in "xenvif_rx_action" seems
>> unfeasible.
>> >>>   - A worst case gamble seems impossible either .. if you take multiple
>> frags * multiple times the "get_next_rx_buffer" ... you would probably be
>> back at just
>> >>>     setting the needed_slots to MAX_SKB_FRAGS.
>> >>>
>> >>> - Other thing would be checking for the available slots before doing the
>> "get_next_rx_buffer" .. how ever .. we don't account for how many slots we
>> still need to
>> >>>   just process the remaining frags.
>> >>>
>> 
>> >> We've done a worst case estimation for whole SKB (linear area + all
>> >> frags) already, at the very first beginning. That's what
>> >> max_slots_needed is for.
>> 
>> >>> - Just remove the whole "get_next_rx_buffer".. just tested it but it
>> seems the "get_next_rx_buffer" is necessary ..  when i make
>> start_new_rx_buffer always return false
>> >>>   i hit the bug below :S
>> >>>
>> >>> So the questions are ... is the "get_next_rx_buffer" part really necessary
>> ?
>> >>>    - if not, what is the benefit of the "get_next_rx_buffer" and does that
>> outweigh the additional cost of accounting
>> >>>      of needed_slots for the frags that have yet to be processed ?
>> >>>    - if yes, erhmmm what would be the best acceptable solution ..
>> returning to using MAX_SKB_FRAGS ?
>> >>>
>> 
>> >> I think you need to answer several questions first:
>> >> 1. is max_slots_needed actually large enough to cover whole SKB?
>> >         No it's not if, you end up calling "get_next_rx_buffer" one or multiple
>> times when processing the SKB
>> >         you have the chance of overrunning (or be saved because prod gets
>> upped before you overrun).
>> 
>> >> 2. is the test of ring slot availability acurate?
>> >         Seems to be.
>> 
>> >> 3. is the number of ring slots consumed larger than max_slots_needed? (I
>> >>    guess the answer is yes)
>> >         Yes that was the whole point.
>> 
>> >> 4. which step in the break-down procedure causes backend to overrun
>> the
>> >>    ring?
>> >         The "get_next_rx_buffer" call .. that is not accounted for (which can be
>> called
>> >         multiple times per frag (unless some other none obvious size
>> restriction limits this
>> >         to one time per frag or one time per SKB).
>> >         In my errorneous case it is called one time, but it would be nice if there
>> would be some theoretical proof
>> >         instead of just some emperical test.
>> 
>> 
>> >> It doesn't matter how many frags your SKB has and how big a frag is. If
>> >> every step is correct then you're fine. The code you're looking at
>> >> (start_new_rx_buffer / get_next_rx_buffer and friend) needs to be
>> >> carefully examined.
>> 
>> > Well it seems it only calls "get_next_rx_buffer" in some special cases ..
>> (and that also what i'm seeing because it doesn't happen
>> > continously.
>> 
>> > Question is shouldn't this max_needed_slots calc be reverted to what it
>> was before 3.14 and take the time in 3.15 to figure out a
>> > the theoretical max (if it can be calculated upfront) .. or another way to
>> guarantee the code is able to process the whole SKB  ?
>> 
>> >> Wei.
>> 
>> 
>> 
>> 


* RE: [Xen-devel] Xen-unstable Linux 3.14-rc3 and 3.13 Network troubles "bisected"
  2014-03-26 15:22                           ` Sander Eikelenboom
@ 2014-03-26 15:50                             ` Paul Durrant
  2014-03-26 16:06                               ` Sander Eikelenboom
  0 siblings, 1 reply; 20+ messages in thread
From: Paul Durrant @ 2014-03-26 15:50 UTC (permalink / raw)
  To: Sander Eikelenboom
  Cc: Wei Liu, annie li, Zoltan Kiss, xen-devel@lists.xen.org,
	Ian Campbell, linux-kernel, netdev@vger.kernel.org

> -----Original Message-----
> From: Sander Eikelenboom [mailto:linux@eikelenboom.it]
> Sent: 26 March 2014 15:23
> To: Paul Durrant
> Cc: Wei Liu; annie li; Zoltan Kiss; xen-devel@lists.xen.org; Ian Campbell; linux-
> kernel; netdev@vger.kernel.org
> Subject: Re: [Xen-devel] Xen-unstable Linux 3.14-rc3 and 3.13 Network
> troubles "bisected"
> 
> 
> Wednesday, March 26, 2014, 3:44:42 PM, you wrote:
> 
> >> -----Original Message-----
> >> From: Sander Eikelenboom [mailto:linux@eikelenboom.it]
> >> Sent: 26 March 2014 11:11
> >> To: Paul Durrant
> >> Cc: Wei Liu; annie li; Zoltan Kiss; xen-devel@lists.xen.org; Ian Campbell;
> linux-
> >> kernel; netdev@vger.kernel.org
> >> Subject: Re: [Xen-devel] Xen-unstable Linux 3.14-rc3 and 3.13 Network
> >> troubles "bisected"
> >>
> >> Paul,
> >>
> >> You have been awfully silent for this whole thread while this is a
> regression
> >> caused by a patch of you
> >> (ca2f09f2b2c6c25047cfc545d057c4edfcfe561c as clearly stated much earlier
> in
> >> this thread).
> >>
> 
> > Sorry, I've been distracted...
> 
> >> The commit messages states:
> >>     "net_rx_action() is the place where we could do with an accurate
> >> predicition but,
> >>     since that has proven tricky to calculate, a cheap worse-case (but not
> too
> >> bad)
> >>     estimate is all we really need since the only thing we *must* prevent is
> >> xenvif_gop_skb()
> >>     consuming more slots than are available."
> >>
> >> Your "worst-case" calculation stated in the commit message is clearly not
> the
> >> worst case,
> >> since it doesn't take calls to "get_next_rx_buffer" into account.
> >>
> 
> > It should be taking into account the behaviour of start_new_rx_buffer(),
> which should be true if a slot is full or a frag will overflow the current slot and
> doesn't require splitting.
> > The code in net_rx_action() makes the assumption that each frag will
> require as many slots as its size requires, i.e. it assumes no packing of
> multiple frags into a single slot, so it should be a worst case.
> > Did I miss something in that logic?
> 
> Yes.
> In "xenvif_gop_skb()" this loop:
> 
>         for (i = 0; i < nr_frags; i++) {
>                 xenvif_gop_frag_copy(vif, skb, npo,
>                                      skb_frag_page(&skb_shinfo(skb)->frags[i]),
>                                      skb_frag_size(&skb_shinfo(skb)->frags[i]),
>                                      skb_shinfo(skb)->frags[i].page_offset,
>                                      &head);
>         }
> 
> Is capable of using up (at least) 1 slot more than is anticipated for in
> "net_rx_action()"  by this code:
> 
>                 for (i = 0; i < skb_shinfo(skb)->nr_frags; i++) {
>                         unsigned int size;
>                         size = skb_frag_size(&skb_shinfo(skb)->frags[i]);
>                         max_slots_needed += DIV_ROUND_UP(size, PAGE_SIZE);
>                 }
> 
> And this happens when it calls "get_next_rx_buffer()" from
> "xenvif_gop_frag_copy()" where it's breaking down the frag.
> 

The function that determines whether to consume another slot is start_new_rx_buffer() and for each frag I don't see why this would return true more than DIV_ROUND_UP(size, PAGE_SIZE) times. It may be called more times than that since the code in xenvif_gop_frag_copy() must also allow for the offset of the frag but should not return true in all cases. So, I still cannot see why a frag would ever consume more than DIV_ROUND_UP(size, PAGE_SIZE) slots.
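
As a concrete illustration of the arithmetic in dispute (the sizes here are hypothetical, and whether the extra slot can actually be consumed is exactly the question in this thread):

        #include <stdio.h>

        #define PAGE_SIZE          4096UL
        #define DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d))

        int main(void)
        {
                unsigned long frag_size = 4096 + 560;   /* hypothetical frag */

                /* Per-frag estimate used by net_rx_action(): 2 slots. */
                printf("estimated: %lu slots\n",
                       DIV_ROUND_UP(frag_size, PAGE_SIZE));

                /* Sander's claim: if the copy of those bytes does not begin
                 * at a slot boundary, they can straddle 3 slots, i.e. one
                 * get_next_rx_buffer() call more than the estimate.
                 * Paul's claim: start_new_rx_buffer() should never let that
                 * happen more than DIV_ROUND_UP(size, PAGE_SIZE) times.
                 */
                return 0;
        }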

  Paul

> Ultimately this results in bad grant reference warnings (and packets marked
> as "errors" in the interface statistics).
> 
> In my case it always seems to be a skb with 1 frag which is broken down in 5
> or 6 pieces ..
> 
> So "get_next_rx_buffer()" is called once .. and i'm overrunning the ring with
> 1 slot, but i'm not sure if that's not coincedence
> since in the code there seem to be no explicit limitation on how often this
> code path is taken. So perhaps it's implicitly limited
> since packets and frags can't be arbitrarily large in comparison with the
> page_size but that's not something i'm capable of figuring out :-)
> 
> 
> 
> >   Paul
> 
> >> Problem is that a worst case calculation would probably be reverting to
> the
> >> old calculation,
> >> and the problems this patch was trying to solve would reappear, but
> >> introducing new regressions
> >> isn't very useful either. And since it seems such a tricky and fragile thing to
> >> determine, it would
> >> probably be best to be split into a distinct function with a comment to
> explain
> >> the rationale used.
> >>
> >> Since this doesn't seem to progress very fast .. CC'ed some more folks ..
> you
> >> never know ..
> >>
> >> --
> >> Sander
> >>
> >>
> >> Tuesday, March 25, 2014, 4:29:42 PM, you wrote:
> >>
> >>
> >> > Tuesday, March 25, 2014, 4:15:39 PM, you wrote:
> >>
> >> >> On Sat, Mar 22, 2014 at 07:28:34PM +0100, Sander Eikelenboom wrote:
> >> >> [...]
> >> >>> > Yes there is only one frag .. but it seems to be much larger than
> >> PAGE_SIZE .. and xenvif_gop_frag_copy brakes that frag down into
> smaller
> >> bits .. hence the calculation in xenvif_rx_action determining the slots
> needed
> >> by doing:
> >> >>>
> >> >>> >                 for (i = 0; i < nr_frags; i++) {
> >> >>> >                         unsigned int size;
> >> >>> >                         size = skb_frag_size(&skb_shinfo(skb)->frags[i]);
> >> >>> >                         max_slots_needed += DIV_ROUND_UP(size,
> PAGE_SIZE);
> >> >>> >                 }
> >> >>>
> >> >>> > But the code in xenvif_gop_frag_copy .. seems to be needing one
> >> more slot (from the emperical test) .. and calling "get_next_rx_buffer"
> >> seems involved in that ..
> >> >>>
> >> >>> Hmm looked again .. and it seems this is it .. when your frags are large
> >> enough you have the chance of running into this.
> >> >>>
> >>
> >> >> get_next_rx_buffer is guarded by start_new_rx_buffer. Do you see
> any
> >> >> problem with that implementation?
> >> > In general no, but "get_next_rx_buffer" up's cons .. and the calculations
> >> done in "xenvif_rx_action" for max_slots_needed to prevent the overrun
> >> > don't count in this possibility. So it's not the guarding of
> >> "start_new_rx_buffer" that is at fault. It's the ones early in
> >> "xenvif_rx_action".
> >> > The ones that were changed by Paul's patch from MAX_SKB_FRAGS to a
> >> calculated value that should be a "slim fit".
> >>
> >> > The problem is in determining upfront in "xenvif_rx_action" when and
> how
> >> often the "get_next_rx_buffer" path will be taken.
> >> > Unless there are other non direct restrictions (from a size point of view)
> it
> >> can be called multiple times per frag per skb.
> >>
> >> >>> Problem is .. i don't see an easy fix, the "one more slot" of the
> empirical
> >> test doesn't seem to be the worst case either (i think):
> >> >>>
> >> >>> - In my case the packets that hit this only have 1 frag, but i could have
> >> had more frags.
> >> >>>   I also think you can't rule out the possibility of doing the
> >> "get_next_rx_buffer" for multiple subsequent frags from one packet,
> >> >>>   so in the worst (and perhaps even from a single frag since it's looped
> >> over a split of it in what seems PAGE_SIZE pieces.)
> >> >>>   - So an exact calculation of how much slots we are going to need for
> >> hitting this "get_next_rx_buffer"  upfront in "xenvif_rx_action" seems
> >> unfeasible.
> >> >>>   - A worst case gamble seems impossible either .. if you take multiple
> >> frags * multiple times the "get_next_rx_buffer" ... you would probably be
> >> back at just
> >> >>>     setting the needed_slots to MAX_SKB_FRAGS.
> >> >>>
> >> >>> - Other thing would be checking for the available slots before doing
> the
> >> "get_next_rx_buffer" .. how ever .. we don't account for how many slots
> we
> >> still need to
> >> >>>   just process the remaining frags.
> >> >>>
> >>
> >> >> We've done a worst case estimation for whole SKB (linear area + all
> >> >> frags) already, at the very first beginning. That's what
> >> >> max_slots_needed is for.
> >>
> >> >>> - Just remove the whole "get_next_rx_buffer".. just tested it but it
> >> seems the "get_next_rx_buffer" is necessary ..  when i make
> >> start_new_rx_buffer always return false
> >> >>>   i hit the bug below :S
> >> >>>
> >> >>> So the questions are ... is the "get_next_rx_buffer" part really
> necessary
> >> ?
> >> >>>    - if not, what is the benefit of the "get_next_rx_buffer" and does
> that
> >> outweigh the additional cost of accounting
> >> >>>      of needed_slots for the frags that have yet to be processed ?
> >> >>>    - if yes, erhmmm what would be the best acceptable solution ..
> >> returning to using MAX_SKB_FRAGS ?
> >> >>>
> >>
> >> >> I think you need to answer several questions first:
> >> >> 1. is max_slots_needed actually large enough to cover whole SKB?
> >> >         No it's not if, you end up calling "get_next_rx_buffer" one or
> multiple
> >> times when processing the SKB
> >> >         you have the chance of overrunning (or be saved because prod gets
> >> upped before you overrun).
> >>
> >> >> 2. is the test of ring slot availability acurate?
> >> >         Seems to be.
> >>
> >> >> 3. is the number of ring slots consumed larger than
> max_slots_needed? (I
> >> >>    guess the answer is yes)
> >> >         Yes that was the whole point.
> >>
> >> >> 4. which step in the break-down procedure causes backend to overrun
> >> the
> >> >>    ring?
> >> >         The "get_next_rx_buffer" call .. that is not accounted for (which can
> be
> >> called
> >> >         multiple times per frag (unless some other none obvious size
> >> restriction limits this
> >> >         to one time per frag or one time per SKB).
> >> >         In my errorneous case it is called one time, but it would be nice if
> there
> >> would be some theoretical proof
> >> >         instead of just some emperical test.
> >>
> >>
> >> >> It doesn't matter how many frags your SKB has and how big a frag is. If
> >> >> every step is correct then you're fine. The code you're looking at
> >> >> (start_new_rx_buffer / get_next_rx_buffer and friend) needs to be
> >> >> carefully examined.
> >>
> >> > Well it seems it only calls "get_next_rx_buffer" in some special cases ..
> >> (and that also what i'm seeing because it doesn't happen
> >> > continously.
> >>
> >> > Question is shouldn't this max_needed_slots calc be reverted to what it
> >> was before 3.14 and take the time in 3.15 to figure out a
> >> > the theoretical max (if it can be calculated upfront) .. or another way to
> >> guarantee the code is able to process the whole SKB  ?
> >>
> >> >> Wei.
> >>
> >>
> >>
> >>
> 
> 


* Re: [Xen-devel] Xen-unstable Linux 3.14-rc3 and 3.13 Network troubles "bisected"
  2014-03-26 15:50                             ` Paul Durrant
@ 2014-03-26 16:06                               ` Sander Eikelenboom
  2014-03-26 16:25                                 ` Paul Durrant
  0 siblings, 1 reply; 20+ messages in thread
From: Sander Eikelenboom @ 2014-03-26 16:06 UTC (permalink / raw)
  To: Paul Durrant
  Cc: Wei Liu, annie li, Zoltan Kiss, xen-devel@lists.xen.org,
	Ian Campbell, linux-kernel, netdev@vger.kernel.org


Wednesday, March 26, 2014, 4:50:30 PM, you wrote:

>> -----Original Message-----
>> From: Sander Eikelenboom [mailto:linux@eikelenboom.it]
>> Sent: 26 March 2014 15:23
>> To: Paul Durrant
>> Cc: Wei Liu; annie li; Zoltan Kiss; xen-devel@lists.xen.org; Ian Campbell; linux-
>> kernel; netdev@vger.kernel.org
>> Subject: Re: [Xen-devel] Xen-unstable Linux 3.14-rc3 and 3.13 Network
>> troubles "bisected"
>> 
>> 
>> Wednesday, March 26, 2014, 3:44:42 PM, you wrote:
>> 
>> >> -----Original Message-----
>> >> From: Sander Eikelenboom [mailto:linux@eikelenboom.it]
>> >> Sent: 26 March 2014 11:11
>> >> To: Paul Durrant
>> >> Cc: Wei Liu; annie li; Zoltan Kiss; xen-devel@lists.xen.org; Ian Campbell;
>> linux-
>> >> kernel; netdev@vger.kernel.org
>> >> Subject: Re: [Xen-devel] Xen-unstable Linux 3.14-rc3 and 3.13 Network
>> >> troubles "bisected"
>> >>
>> >> Paul,
>> >>
>> >> You have been awfully silent for this whole thread while this is a
>> regression
>> >> caused by a patch of you
>> >> (ca2f09f2b2c6c25047cfc545d057c4edfcfe561c as clearly stated much earlier
>> in
>> >> this thread).
>> >>
>> 
>> > Sorry, I've been distracted...
>> 
>> >> The commit messages states:
>> >>     "net_rx_action() is the place where we could do with an accurate
>> >> predicition but,
>> >>     since that has proven tricky to calculate, a cheap worse-case (but not
>> too
>> >> bad)
>> >>     estimate is all we really need since the only thing we *must* prevent is
>> >> xenvif_gop_skb()
>> >>     consuming more slots than are available."
>> >>
>> >> Your "worst-case" calculation stated in the commit message is clearly not
>> the
>> >> worst case,
>> >> since it doesn't take calls to "get_next_rx_buffer" into account.
>> >>
>> 
>> > It should be taking into account the behaviour of start_new_rx_buffer(),
>> which should be true if a slot is full or a frag will overflow the current slot and
>> doesn't require splitting.
>> > The code in net_rx_action() makes the assumption that each frag will
>> require as many slots as its size requires, i.e. it assumes no packing of
>> multiple frags into a single slot, so it should be a worst case.
>> > Did I miss something in that logic?
>> 
>> Yes.
>> In "xenvif_gop_skb()" this loop:
>> 
>>         for (i = 0; i < nr_frags; i++) {
>>                 xenvif_gop_frag_copy(vif, skb, npo,
>>                                      skb_frag_page(&skb_shinfo(skb)->frags[i]),
>>                                      skb_frag_size(&skb_shinfo(skb)->frags[i]),
>>                                      skb_shinfo(skb)->frags[i].page_offset,
>>                                      &head);
>>         }
>> 
>> Is capable of using up (at least) 1 slot more than is anticipated for in
>> "net_rx_action()"  by this code:
>> 
>>                 for (i = 0; i < skb_shinfo(skb)->nr_frags; i++) {
>>                         unsigned int size;
>>                         size = skb_frag_size(&skb_shinfo(skb)->frags[i]);
>>                         max_slots_needed += DIV_ROUND_UP(size, PAGE_SIZE);
>>                 }
>> 
>> And this happens when it calls "get_next_rx_buffer()" from
>> "xenvif_gop_frag_copy()" where it's breaking down the frag.
>> 

> The function that determines whether to consume another slot is start_new_rx_buffer() and for each frag I don't see why this would return true more than DIV_ROUND_UP(size, PAGE_SIZE) times.
> It may be called more times than that since the code in xenvif_gop_frag_copy() must also allow for the offset of the frag but should not return true in all cases.
> So, I still cannot see why a frag would ever consume more than DIV_ROUND_UP(size, PAGE_SIZE) slots.

Well, here is a case where a frag is broken down into 2 pieces:

[ 1156.870372] vif vif-7-0 vif7.0: ?!? xenvif_gop_frag_copy Me here 1  npo->meta_prod:39 vif->rx.sring->req_prod:2105867 vif->rx.req_cons:2105867 npo->copy_gref:760  npo->copy_off:4096  MAX_BUFFER_OFFSET:4096 bytes:560 size:560  offset:0 head:1273462060 i:2 vif->rx.sring->req_event:2104275 estimated_slots_needed:4 reserved_slots_left:0
[ 1156.871971] vif vif-7-0 vif7.0: ?!? xenvif_start_xmit stopping queue !  min_slots_needed:1 min_slots_needed_2:0 min_slots_needed_3:0 vif->rx.sring->req_prod:2105867 vif->rx.req_cons:2105867 vif->rx.sring->req_event:2105868 skb->len:66 skb->data_len:0
[ 1156.964316] vif vif-7-0 vif7.0: ?!? get_next_rx_buffer before req npo->meta_prod:39 vif->rx.sring->req_prod:2105867 vif->rx.req_cons:2105867 vif->rx.sring->req_event:2105868, reserved_slots_left:0
[ 1157.001635] vif vif-7-0 vif7.0: ?!? get_next_rx_buffer after req npo->meta_prod:39 vif->rx.sring->req_prod:2105867 vif->rx.req_cons:2105868 req->gref:4325379 req->id:11 vif->rx.sring->req_event:2105868 reserved_slots_left:-1
[ 1157.039095] vif vif-7-0 vif7.0: ?!? xenvif_gop_frag_copy Me here 2  npo->meta_prod:40 vif->rx.sring->req_prod:2105867 vif->rx.req_cons:2105868 npo->copy_gref:4325379  npo->copy_off:0  MAX_BUFFER_OFFSET:4096 bytes:560 size:560  offset:0 head:1273462060 i:2 vif->rx.sring->req_event:2105868 estimated_slots_needed:4 reserved_slots_left:-1
[ 1157.095216] vif vif-7-0 vif7.0: ?!? xenvif_gop_frag_copy Me here end npo->meta_prod:40 vif->rx.sring->req_prod:2105867 vif->rx.req_cons:2105868 npo->copy_gref:4325379 npo->copy_off:560  MAX_BUFFER_OFFSET:4096 bytes:560 size:0  offset:560 head:1273462060 i:3 vif->rx.sring->req_event:2105868 gso_gaps:0 estimated_slots_needed:4 reserved_slots_left:-1
[ 1157.151338] vif vif-7-0 vif7.0: ?!? xenvif_gop_skb Me here 4 after npo->meta_prod:40 old_meta_prod:36 vif->rx.sring->req_prod:2105867 vif->rx.req_cons:2105868 meta->gso_type:1 meta->gso_size:1448 nr_frags:1 req->gref:657 req->id:7 estimated_slots_needed:4 i(frag):0 j(data):1 reserved_slots_left:-1
[ 1157.188908] vif vif-7-0 vif7.0: ?!? xenvif_gop_skb Me here 5 npo->meta_prod:40 old_meta_prod:36 vif->rx.sring->req_prod:2105867 vif->rx.req_cons:2105868 meta->gso_type:1 meta->gso_size:1448 nr_frags:1 req->gref:657 req->id:7 estimated_slots_needed:4 j(data):1 reserved_slots_left:-1    used in funcstart: 0 + 1 .. used_dataloop:1 .. used_fragloop:3
[ 1157.244975] vif vif-7-0 vif7.0: ?!? xenvif_rx_action me here 2 ..  vif->rx.sring->req_prod:2105867 vif->rx.req_cons:2105868 sco->meta_slots_used:4 max_upped_gso:1 skb_is_gso(skb):1 max_slots_needed:4 j:6 is_gso:1 nr_frags:1 firstpart:1 secondpart:2 reserved_slots_left:-1

- When processing an SKB we end up in "xenvif_gop_frag_copy" while prod == cons ... but we still have bytes and size left ..
- start_new_rx_buffer() has returned true ..
- so we end up in get_next_rx_buffer
- this does a RING_GET_REQUEST and ups cons ..
- and we end up with a bad grant reference.

Sometimes we are saved by the bell .. since additional slots have become free (you see cons become > prod in "get_next_rx_buffer", but shortly after that prod is increased ..
just in time to not cause an overrun).
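
For completeness, a minimal stand-alone check mirroring the numbers in the trace above (values taken from the log lines; this is an illustration, not driver code):

        #include <stdio.h>

        int main(void)
        {
                unsigned int req_prod = 2105867;  /* requests posted by the frontend */
                unsigned int req_cons = 2105867;  /* already consumed: ring is empty */

                /* start_new_rx_buffer() returned true for the remaining 560
                 * bytes (copy_off == 4096), so get_next_rx_buffer() fetches
                 * one more request and ups cons ...
                 */
                req_cons++;

                if (req_cons > req_prod)
                        printf("overrun by %u slot(s) -> bad grant reference\n",
                               req_cons - req_prod);
                return 0;
        }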

If you need additional / other info, please cook up a debug patch with what you need.

--
Sander






>   Paul

>> Ultimately this results in bad grant reference warnings (and packets marked
>> as "errors" in the interface statistics).
>> 
>> In my case it always seems to be a skb with 1 frag which is broken down in 5
>> or 6 pieces ..
>> 
>> So "get_next_rx_buffer()" is called once .. and i'm overrunning the ring with
>> 1 slot, but i'm not sure if that's not coincedence
>> since in the code there seem to be no explicit limitation on how often this
>> code path is taken. So perhaps it's implicitly limited
>> since packets and frags can't be arbitrarily large in comparison with the
>> page_size but that's not something i'm capable of figuring out :-)
>> 
>> 
>> 
>> >   Paul
>> 
>> >> Problem is that a worst case calculation would probably be reverting to
>> the
>> >> old calculation,
>> >> and the problems this patch was trying to solve would reappear, but
>> >> introducing new regressions
>> >> isn't very useful either. And since it seems such a tricky and fragile thing to
>> >> determine, it would
>> >> probably be best to be split into a distinct function with a comment to
>> explain
>> >> the rationale used.
>> >>
>> >> Since this doesn't seem to progress very fast .. CC'ed some more folks ..
>> you
>> >> never know ..
>> >>
>> >> --
>> >> Sander
>> >>
>> >>
>> >> Tuesday, March 25, 2014, 4:29:42 PM, you wrote:
>> >>
>> >>
>> >> > Tuesday, March 25, 2014, 4:15:39 PM, you wrote:
>> >>
>> >> >> On Sat, Mar 22, 2014 at 07:28:34PM +0100, Sander Eikelenboom wrote:
>> >> >> [...]
>> >> >>> > Yes there is only one frag .. but it seems to be much larger than
>> >> PAGE_SIZE .. and xenvif_gop_frag_copy brakes that frag down into
>> smaller
>> >> bits .. hence the calculation in xenvif_rx_action determining the slots
>> needed
>> >> by doing:
>> >> >>>
>> >> >>> >                 for (i = 0; i < nr_frags; i++) {
>> >> >>> >                         unsigned int size;
>> >> >>> >                         size = skb_frag_size(&skb_shinfo(skb)->frags[i]);
>> >> >>> >                         max_slots_needed += DIV_ROUND_UP(size,
>> PAGE_SIZE);
>> >> >>> >                 }
>> >> >>>
>> >> >>> > But the code in xenvif_gop_frag_copy .. seems to be needing one
>> >> more slot (from the emperical test) .. and calling "get_next_rx_buffer"
>> >> seems involved in that ..
>> >> >>>
>> >> >>> Hmm looked again .. and it seems this is it .. when your frags are large
>> >> enough you have the chance of running into this.
>> >> >>>
>> >>
>> >> >> get_next_rx_buffer is guarded by start_new_rx_buffer. Do you see
>> any
>> >> >> problem with that implementation?
>> >> > In general no, but "get_next_rx_buffer" up's cons .. and the calculations
>> >> done in "xenvif_rx_action" for max_slots_needed to prevent the overrun
>> >> > don't count in this possibility. So it's not the guarding of
>> >> "start_new_rx_buffer" that is at fault. It's the ones early in
>> >> "xenvif_rx_action".
>> >> > The ones that were changed by Paul's patch from MAX_SKB_FRAGS to a
>> >> calculated value that should be a "slim fit".
>> >>
>> >> > The problem is in determining upfront in "xenvif_rx_action" when and
>> how
>> >> often the "get_next_rx_buffer" path will be taken.
>> >> > Unless there are other non direct restrictions (from a size point of view)
>> it
>> >> can be called multiple times per frag per skb.
>> >>
>> >> >>> Problem is .. i don't see an easy fix, the "one more slot" of the
>> empirical
>> >> test doesn't seem to be the worst case either (i think):
>> >> >>>
>> >> >>> - In my case the packets that hit this only have 1 frag, but i could have
>> >> had more frags.
>> >> >>>   I also think you can't rule out the possibility of doing the
>> >> "get_next_rx_buffer" for multiple subsequent frags from one packet,
>> >> >>>   so in the worst (and perhaps even from a single frag since it's looped
>> >> over a split of it in what seems PAGE_SIZE pieces.)
>> >> >>>   - So an exact calculation of how much slots we are going to need for
>> >> hitting this "get_next_rx_buffer"  upfront in "xenvif_rx_action" seems
>> >> unfeasible.
>> >> >>>   - A worst case gamble seems impossible either .. if you take multiple
>> >> frags * multiple times the "get_next_rx_buffer" ... you would probably be
>> >> back at just
>> >> >>>     setting the needed_slots to MAX_SKB_FRAGS.
>> >> >>>
>> >> >>> - Other thing would be checking for the available slots before doing
>> the
>> >> "get_next_rx_buffer" .. how ever .. we don't account for how many slots
>> we
>> >> still need to
>> >> >>>   just process the remaining frags.
>> >> >>>
>> >>
>> >> >> We've done a worst case estimation for whole SKB (linear area + all
>> >> >> frags) already, at the very first beginning. That's what
>> >> >> max_slots_needed is for.
>> >>
>> >> >>> - Just remove the whole "get_next_rx_buffer".. just tested it but it
>> >> seems the "get_next_rx_buffer" is necessary ..  when i make
>> >> start_new_rx_buffer always return false
>> >> >>>   i hit the bug below :S
>> >> >>>
>> >> >>> So the questions are ... is the "get_next_rx_buffer" part really
>> necessary
>> >> ?
>> >> >>>    - if not, what is the benefit of the "get_next_rx_buffer" and does
>> that
>> >> outweigh the additional cost of accounting
>> >> >>>      of needed_slots for the frags that have yet to be processed ?
>> >> >>>    - if yes, erhmmm what would be the best acceptable solution ..
>> >> returning to using MAX_SKB_FRAGS ?
>> >> >>>
>> >>
>> >> >> I think you need to answer several questions first:
>> >> >> 1. is max_slots_needed actually large enough to cover whole SKB?
>> >> >         No it's not if, you end up calling "get_next_rx_buffer" one or
>> multiple
>> >> times when processing the SKB
>> >> >         you have the chance of overrunning (or be saved because prod gets
>> >> upped before you overrun).
>> >>
>> >> >> 2. is the test of ring slot availability acurate?
>> >> >         Seems to be.
>> >>
>> >> >> 3. is the number of ring slots consumed larger than
>> max_slots_needed? (I
>> >> >>    guess the answer is yes)
>> >> >         Yes that was the whole point.
>> >>
>> >> >> 4. which step in the break-down procedure causes backend to overrun
>> >> the
>> >> >>    ring?
>> >> >         The "get_next_rx_buffer" call .. that is not accounted for (which can
>> be
>> >> called
>> >> >         multiple times per frag (unless some other none obvious size
>> >> restriction limits this
>> >> >         to one time per frag or one time per SKB).
>> >> >         In my errorneous case it is called one time, but it would be nice if
>> there
>> >> would be some theoretical proof
>> >> >         instead of just some emperical test.
>> >>
>> >>
>> >> >> It doesn't matter how many frags your SKB has and how big a frag is. If
>> >> >> every step is correct then you're fine. The code you're looking at
>> >> >> (start_new_rx_buffer / get_next_rx_buffer and friend) needs to be
>> >> >> carefully examined.
>> >>
>> >> > Well it seems it only calls "get_next_rx_buffer" in some special cases ..
>> >> (and that also what i'm seeing because it doesn't happen
>> >> > continously.
>> >>
>> >> > Question is shouldn't this max_needed_slots calc be reverted to what it
>> >> was before 3.14 and take the time in 3.15 to figure out a
>> >> > the theoretical max (if it can be calculated upfront) .. or another way to
>> >> guarantee the code is able to process the whole SKB  ?
>> >>
>> >> >> Wei.
>> >>
>> >>
>> >>
>> >>
>> 
>> 


* RE: [Xen-devel] Xen-unstable Linux 3.14-rc3 and 3.13 Network troubles "bisected"
  2014-03-26 16:06                               ` Sander Eikelenboom
@ 2014-03-26 16:25                                 ` Paul Durrant
  2014-03-26 16:53                                   ` Sander Eikelenboom
  0 siblings, 1 reply; 20+ messages in thread
From: Paul Durrant @ 2014-03-26 16:25 UTC (permalink / raw)
  To: Sander Eikelenboom
  Cc: Wei Liu, annie li, Zoltan Kiss, xen-devel@lists.xen.org,
	Ian Campbell, linux-kernel, netdev@vger.kernel.org

> -----Original Message-----
> From: Sander Eikelenboom [mailto:linux@eikelenboom.it]
> Sent: 26 March 2014 16:07
> To: Paul Durrant
> Cc: Wei Liu; annie li; Zoltan Kiss; xen-devel@lists.xen.org; Ian Campbell; linux-
> kernel; netdev@vger.kernel.org
> Subject: Re: [Xen-devel] Xen-unstable Linux 3.14-rc3 and 3.13 Network
> troubles "bisected"
> 
> 
> Wednesday, March 26, 2014, 4:50:30 PM, you wrote:
> 
> >> -----Original Message-----
> >> From: Sander Eikelenboom [mailto:linux@eikelenboom.it]
> >> Sent: 26 March 2014 15:23
> >> To: Paul Durrant
> >> Cc: Wei Liu; annie li; Zoltan Kiss; xen-devel@lists.xen.org; Ian Campbell;
> linux-
> >> kernel; netdev@vger.kernel.org
> >> Subject: Re: [Xen-devel] Xen-unstable Linux 3.14-rc3 and 3.13 Network
> >> troubles "bisected"
> >>
> >>
> >> Wednesday, March 26, 2014, 3:44:42 PM, you wrote:
> >>
> >> >> -----Original Message-----
> >> >> From: Sander Eikelenboom [mailto:linux@eikelenboom.it]
> >> >> Sent: 26 March 2014 11:11
> >> >> To: Paul Durrant
> >> >> Cc: Wei Liu; annie li; Zoltan Kiss; xen-devel@lists.xen.org; Ian Campbell;
> >> linux-
> >> >> kernel; netdev@vger.kernel.org
> >> >> Subject: Re: [Xen-devel] Xen-unstable Linux 3.14-rc3 and 3.13 Network
> >> >> troubles "bisected"
> >> >>
> >> >> Paul,
> >> >>
> >> >> You have been awfully silent for this whole thread while this is a
> >> regression
> >> >> caused by a patch of you
> >> >> (ca2f09f2b2c6c25047cfc545d057c4edfcfe561c as clearly stated much
> earlier
> >> in
> >> >> this thread).
> >> >>
> >>
> >> > Sorry, I've been distracted...
> >>
> >> >> The commit messages states:
> >> >>     "net_rx_action() is the place where we could do with an accurate
> >> >> predicition but,
> >> >>     since that has proven tricky to calculate, a cheap worse-case (but not
> >> too
> >> >> bad)
> >> >>     estimate is all we really need since the only thing we *must* prevent
> is
> >> >> xenvif_gop_skb()
> >> >>     consuming more slots than are available."
> >> >>
> >> >> Your "worst-case" calculation stated in the commit message is clearly
> not
> >> the
> >> >> worst case,
> >> >> since it doesn't take calls to "get_next_rx_buffer" into account.
> >> >>
> >>
> >> > It should be taking into account the behaviour of
> start_new_rx_buffer(),
> >> which should be true if a slot is full or a frag will overflow the current slot
> and
> >> doesn't require splitting.
> >> > The code in net_rx_action() makes the assumption that each frag will
> >> require as many slots as its size requires, i.e. it assumes no packing of
> >> multiple frags into a single slot, so it should be a worst case.
> >> > Did I miss something in that logic?
> >>
> >> Yes.
> >> In "xenvif_gop_skb()" this loop:
> >>
> >>         for (i = 0; i < nr_frags; i++) {
> >>                 xenvif_gop_frag_copy(vif, skb, npo,
> >>                                      skb_frag_page(&skb_shinfo(skb)->frags[i]),
> >>                                      skb_frag_size(&skb_shinfo(skb)->frags[i]),
> >>                                      skb_shinfo(skb)->frags[i].page_offset,
> >>                                      &head);
> >>         }
> >>
> >> Is capable of using up (at least) 1 slot more than is anticipated for in
> >> "net_rx_action()"  by this code:
> >>
> >>                 for (i = 0; i < skb_shinfo(skb)->nr_frags; i++) {
> >>                         unsigned int size;
> >>                         size = skb_frag_size(&skb_shinfo(skb)->frags[i]);
> >>                         max_slots_needed += DIV_ROUND_UP(size, PAGE_SIZE);
> >>                 }
> >>
> >> And this happens when it calls "get_next_rx_buffer()" from
> >> "xenvif_gop_frag_copy()" where it's breaking down the frag.
> >>
> 
> > The function that determines whether to consume another slot is
> start_new_rx_buffer() and for each frag I don't see why this would return
> true more than DIV_ROUND_UP(size, PAGE_SIZE) times.
> > It may be called more times than that since the code in
> xenvif_gop_frag_copy() must also allow for the offset of the frag but should
> not return true in all cases.
> > So, I still cannot see why a frag would ever consume more than
> DIV_ROUND_UP(size, PAGE_SIZE) slots.
> 
> Well here a case were a frag is broken down in 2 pieces:
> 
> [ 1156.870372] vif vif-7-0 vif7.0: ?!? xenvif_gop_frag_copy Me here 1  npo-
> >meta_prod:39 vif->rx.sring->req_prod:2105867 vif->rx.req_cons:2105867
> npo->copy_gref:760  npo->copy_off:4096  MAX_BUFFER_OFFSET:4096
> bytes:560 size:560  offset:0 head:1273462060 i:2 vif->rx.sring-
> >req_event:2104275 estimated_slots_needed:4 reserved_slots_left:0
> [ 1156.871971] vif vif-7-0 vif7.0: ?!? xenvif_start_xmit stopping queue !
> min_slots_needed:1 min_slots_needed_2:0 min_slots_needed_3:0 vif-
> >rx.sring->req_prod:2105867 vif->rx.req_cons:2105867 vif->rx.sring-
> >req_event:2105868 skb->len:66 skb->data_len:0
> [ 1156.964316] vif vif-7-0 vif7.0: ?!? get_next_rx_buffer before req npo-
> >meta_prod:39 vif->rx.sring->req_prod:2105867 vif->rx.req_cons:2105867
> vif->rx.sring->req_event:2105868, reserved_slots_left:0
> [ 1157.001635] vif vif-7-0 vif7.0: ?!? get_next_rx_buffer after req npo-
> >meta_prod:39 vif->rx.sring->req_prod:2105867 vif->rx.req_cons:2105868
> req->gref:4325379 req->id:11 vif->rx.sring->req_event:2105868
> reserved_slots_left:-1
> [ 1157.039095] vif vif-7-0 vif7.0: ?!? xenvif_gop_frag_copy Me here 2  npo-
> >meta_prod:40 vif->rx.sring->req_prod:2105867 vif->rx.req_cons:2105868
> npo->copy_gref:4325379  npo->copy_off:0  MAX_BUFFER_OFFSET:4096
> bytes:560 size:560  offset:0 head:1273462060 i:2 vif->rx.sring-
> >req_event:2105868 estimated_slots_needed:4 reserved_slots_left:-1
> [ 1157.095216] vif vif-7-0 vif7.0: ?!? xenvif_gop_frag_copy Me here end npo-
> >meta_prod:40 vif->rx.sring->req_prod:2105867 vif->rx.req_cons:2105868
> npo->copy_gref:4325379 npo->copy_off:560  MAX_BUFFER_OFFSET:4096
> bytes:560 size:0  offset:560 head:1273462060 i:3 vif->rx.sring-
> >req_event:2105868 gso_gaps:0 estimated_slots_needed:4
> reserved_slots_left:-1
> [ 1157.151338] vif vif-7-0 vif7.0: ?!? xenvif_gop_skb Me here 4 after npo-
> >meta_prod:40 old_meta_prod:36 vif->rx.sring->req_prod:2105867 vif-
> >rx.req_cons:2105868 meta->gso_type:1 meta->gso_size:1448 nr_frags:1
> req->gref:657 req->id:7 estimated_slots_needed:4 i(frag):0 j(data):1
> reserved_slots_left:-1
> [ 1157.188908] vif vif-7-0 vif7.0: ?!? xenvif_gop_skb Me here 5 npo-
> >meta_prod:40 old_meta_prod:36 vif->rx.sring->req_prod:2105867 vif-
> >rx.req_cons:2105868 meta->gso_type:1 meta->gso_size:1448 nr_frags:1
> req->gref:657 req->id:7 estimated_slots_needed:4 j(data):1
> reserved_slots_left:-1    used in funcstart: 0 + 1 .. used_dataloop:1 ..
> used_fragloop:3
> [ 1157.244975] vif vif-7-0 vif7.0: ?!? xenvif_rx_action me here 2 ..  vif-
> >rx.sring->req_prod:2105867 vif->rx.req_cons:2105868 sco-
> >meta_slots_used:4 max_upped_gso:1 skb_is_gso(skb):1
> max_slots_needed:4 j:6 is_gso:1 nr_frags:1 firstpart:1 secondpart:2
> reserved_slots_left:-1
> 
> - When processing an SKB we end up in "xenvif_gop_frag_copy" while prod
> == cons ... but we still have bytes and size left ..
> - start_new_rx_buffer() has returned true ..
> - so we end up in get_next_rx_buffer
> - this does a RING_GET_REQUEST and ups cons ..
> - and we end up with a bad grant reference.
> 
> Sometimes we are saved by the bell .. since additional slots have become
> free (you see cons become > prod in "get_next_rx_buffer" but shortly after
> that prod is increased ..
> just in time to not cause a overrun).
> 

Ah, but hang on... There's a BUG_ON meta_slots_used > max_slots_needed, so if we are overflowing the worst-case calculation then why is that BUG_ON not firing?

  Paul

> If you need additional / other info, please cook up a debug patch with what
> you need.
> 
> --
> Sander
> 
> 
> 
> 
> 
> 
> >   Paul
> 
> >> Ultimately this results in bad grant reference warnings (and packets
> marked
> >> as "errors" in the interface statistics).
> >>
> >> In my case it always seems to be a skb with 1 frag which is broken down in
> 5
> >> or 6 pieces ..
> >>
> >> So "get_next_rx_buffer()" is called once .. and i'm overrunning the ring
> with
> >> 1 slot, but i'm not sure if that's not coincedence
> >> since in the code there seem to be no explicit limitation on how often this
> >> code path is taken. So perhaps it's implicitly limited
> >> since packets and frags can't be arbitrarily large in comparison with the
> >> page_size but that's not something i'm capable of figuring out :-)
> >>
> >>
> >>
> >> >   Paul
> >>
> >> >> Problem is that a worst case calculation would probably be reverting to
> >> the
> >> >> old calculation,
> >> >> and the problems this patch was trying to solve would reappear, but
> >> >> introducing new regressions
> >> >> isn't very useful either. And since it seems such a tricky and fragile
> thing to
> >> >> determine, it would
> >> >> probably be best to be split into a distinct function with a comment to
> >> explain
> >> >> the rationale used.
> >> >>
> >> >> Since this doesn't seem to progress very fast .. CC'ed some more folks
> ..
> >> you
> >> >> never know ..
> >> >>
> >> >> --
> >> >> Sander
> >> >>
> >> >>
> >> >> Tuesday, March 25, 2014, 4:29:42 PM, you wrote:
> >> >>
> >> >>
> >> >> > Tuesday, March 25, 2014, 4:15:39 PM, you wrote:
> >> >>
> >> >> >> On Sat, Mar 22, 2014 at 07:28:34PM +0100, Sander Eikelenboom
> wrote:
> >> >> >> [...]
> >> >> >>> > Yes there is only one frag .. but it seems to be much larger than
> >> >> PAGE_SIZE .. and xenvif_gop_frag_copy brakes that frag down into
> >> smaller
> >> >> bits .. hence the calculation in xenvif_rx_action determining the slots
> >> needed
> >> >> by doing:
> >> >> >>>
> >> >> >>> >                 for (i = 0; i < nr_frags; i++) {
> >> >> >>> >                         unsigned int size;
> >> >> >>> >                         size = skb_frag_size(&skb_shinfo(skb)->frags[i]);
> >> >> >>> >                         max_slots_needed += DIV_ROUND_UP(size,
> >> PAGE_SIZE);
> >> >> >>> >                 }
> >> >> >>>
> >> >> >>> > But the code in xenvif_gop_frag_copy .. seems to be needing
> one
> >> >> more slot (from the emperical test) .. and calling "get_next_rx_buffer"
> >> >> seems involved in that ..
> >> >> >>>
> >> >> >>> Hmm looked again .. and it seems this is it .. when your frags are
> large
> >> >> enough you have the chance of running into this.
> >> >> >>>
> >> >>
> >> >> >> get_next_rx_buffer is guarded by start_new_rx_buffer. Do you
> see
> >> any
> >> >> >> problem with that implementation?
> >> >> > In general no, but "get_next_rx_buffer" up's cons .. and the
> calculations
> >> >> done in "xenvif_rx_action" for max_slots_needed to prevent the
> overrun
> >> >> > don't count in this possibility. So it's not the guarding of
> >> >> "start_new_rx_buffer" that is at fault. It's the ones early in
> >> >> "xenvif_rx_action".
> >> >> > The ones that were changed by Paul's patch from MAX_SKB_FRAGS
> to a
> >> >> calculated value that should be a "slim fit".
> >> >>
> >> >> > The problem is in determining upfront in "xenvif_rx_action" when
> and
> >> how
> >> >> often the "get_next_rx_buffer" path will be taken.
> >> >> > Unless there are other non direct restrictions (from a size point of
> view)
> >> it
> >> >> can be called multiple times per frag per skb.
> >> >>
> >> >> >>> Problem is .. i don't see an easy fix, the "one more slot" of the
> >> empirical
> >> >> test doesn't seem to be the worst case either (i think):
> >> >> >>>
> >> >> >>> - In my case the packets that hit this only have 1 frag, but i could
> have
> >> >> had more frags.
> >> >> >>>   I also think you can't rule out the possibility of doing the
> >> >> "get_next_rx_buffer" for multiple subsequent frags from one packet,
> >> >> >>>   so in the worst (and perhaps even from a single frag since it's
> looped
> >> >> over a split of it in what seems PAGE_SIZE pieces.)
> >> >> >>>   - So an exact calculation of how much slots we are going to need
> for
> >> >> hitting this "get_next_rx_buffer"  upfront in "xenvif_rx_action" seems
> >> >> unfeasible.
> >> >> >>>   - A worst case gamble seems impossible either .. if you take
> multiple
> >> >> frags * multiple times the "get_next_rx_buffer" ... you would probably
> be
> >> >> back at just
> >> >> >>>     setting the needed_slots to MAX_SKB_FRAGS.
> >> >> >>>
> >> >> >>> - Other thing would be checking for the available slots before
> doing
> >> the
> >> >> "get_next_rx_buffer" .. how ever .. we don't account for how many
> slots
> >> we
> >> >> still need to
> >> >> >>>   just process the remaining frags.
> >> >> >>>
> >> >>
> >> >> >> We've done a worst case estimation for whole SKB (linear area + all
> >> >> >> frags) already, at the very first beginning. That's what
> >> >> >> max_slots_needed is for.
> >> >>
> >> >> >>> - Just remove the whole "get_next_rx_buffer".. just tested it but
> it
> >> >> seems the "get_next_rx_buffer" is necessary ..  when i make
> >> >> start_new_rx_buffer always return false
> >> >> >>>   i hit the bug below :S
> >> >> >>>
> >> >> >>> So the questions are ... is the "get_next_rx_buffer" part really
> >> necessary
> >> >> ?
> >> >> >>>    - if not, what is the benefit of the "get_next_rx_buffer" and
> does
> >> that
> >> >> outweigh the additional cost of accounting
> >> >> >>>      of needed_slots for the frags that have yet to be processed ?
> >> >> >>>    - if yes, erhmmm what would be the best acceptable solution ..
> >> >> returning to using MAX_SKB_FRAGS ?
> >> >> >>>
> >> >>
> >> >> >> I think you need to answer several questions first:
> >> >> >> 1. is max_slots_needed actually large enough to cover whole SKB?
> >> >> >         No it's not if, you end up calling "get_next_rx_buffer" one or
> >> multiple
> >> >> times when processing the SKB
> >> >> >         you have the chance of overrunning (or be saved because prod
> gets
> >> >> upped before you overrun).
> >> >>
> >> >> >> 2. is the test of ring slot availability acurate?
> >> >> >         Seems to be.
> >> >>
> >> >> >> 3. is the number of ring slots consumed larger than
> >> max_slots_needed? (I
> >> >> >>    guess the answer is yes)
> >> >> >         Yes that was the whole point.
> >> >>
> >> >> >> 4. which step in the break-down procedure causes backend to
> overrun
> >> >> the
> >> >> >>    ring?
> >> >> >         The "get_next_rx_buffer" call .. that is not accounted for (which
> can
> >> be
> >> >> called
> >> >> >         multiple times per frag (unless some other none obvious size
> >> >> restriction limits this
> >> >> >         to one time per frag or one time per SKB).
> >> >> >         In my errorneous case it is called one time, but it would be nice if
> >> there
> >> >> would be some theoretical proof
> >> >> >         instead of just some emperical test.
> >> >>
> >> >>
> >> >> >> It doesn't matter how many frags your SKB has and how big a frag
> is. If
> >> >> >> every step is correct then you're fine. The code you're looking at
> >> >> >> (start_new_rx_buffer / get_next_rx_buffer and friend) needs to
> be
> >> >> >> carefully examined.
> >> >>
> >> >> > Well it seems it only calls "get_next_rx_buffer" in some special cases
> ..
> >> >> (and that also what i'm seeing because it doesn't happen
> >> >> > continously.
> >> >>
> >> >> > Question is shouldn't this max_needed_slots calc be reverted to
> what it
> >> >> was before 3.14 and take the time in 3.15 to figure out a
> >> >> > the theoretical max (if it can be calculated upfront) .. or another way
> to
> >> >> guarantee the code is able to process the whole SKB  ?
> >> >>
> >> >> >> Wei.
> >> >>
> >> >>
> >> >>
> >> >>
> >>
> >>
> 
> 

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [Xen-devel] Xen-unstable Linux 3.14-rc3 and 3.13 Network troubles "bisected"
  2014-03-26 16:25                                 ` Paul Durrant
@ 2014-03-26 16:53                                   ` Sander Eikelenboom
  2014-03-26 17:16                                     ` Paul Durrant
  0 siblings, 1 reply; 20+ messages in thread
From: Sander Eikelenboom @ 2014-03-26 16:53 UTC (permalink / raw)
  To: Paul Durrant
  Cc: Wei Liu, annie li, Zoltan Kiss, xen-devel@lists.xen.org,
	Ian Campbell, linux-kernel, netdev@vger.kernel.org


Wednesday, March 26, 2014, 5:25:21 PM, you wrote:

>> -----Original Message-----
>> From: Sander Eikelenboom [mailto:linux@eikelenboom.it]
>> Sent: 26 March 2014 16:07
>> To: Paul Durrant
>> Cc: Wei Liu; annie li; Zoltan Kiss; xen-devel@lists.xen.org; Ian Campbell; linux-
>> kernel; netdev@vger.kernel.org
>> Subject: Re: [Xen-devel] Xen-unstable Linux 3.14-rc3 and 3.13 Network
>> troubles "bisected"
>> 
>> 
>> Wednesday, March 26, 2014, 4:50:30 PM, you wrote:
>> 
>> >> -----Original Message-----
>> >> From: Sander Eikelenboom [mailto:linux@eikelenboom.it]
>> >> Sent: 26 March 2014 15:23
>> >> To: Paul Durrant
>> >> Cc: Wei Liu; annie li; Zoltan Kiss; xen-devel@lists.xen.org; Ian Campbell;
>> linux-
>> >> kernel; netdev@vger.kernel.org
>> >> Subject: Re: [Xen-devel] Xen-unstable Linux 3.14-rc3 and 3.13 Network
>> >> troubles "bisected"
>> >>
>> >>
>> >> Wednesday, March 26, 2014, 3:44:42 PM, you wrote:
>> >>
>> >> >> -----Original Message-----
>> >> >> From: Sander Eikelenboom [mailto:linux@eikelenboom.it]
>> >> >> Sent: 26 March 2014 11:11
>> >> >> To: Paul Durrant
>> >> >> Cc: Wei Liu; annie li; Zoltan Kiss; xen-devel@lists.xen.org; Ian Campbell;
>> >> linux-
>> >> >> kernel; netdev@vger.kernel.org
>> >> >> Subject: Re: [Xen-devel] Xen-unstable Linux 3.14-rc3 and 3.13 Network
>> >> >> troubles "bisected"
>> >> >>
>> >> >> Paul,
>> >> >>
>> >> >> You have been awfully silent for this whole thread while this is a
>> >> regression
>> >> >> caused by a patch of you
>> >> >> (ca2f09f2b2c6c25047cfc545d057c4edfcfe561c as clearly stated much
>> earlier
>> >> in
>> >> >> this thread).
>> >> >>
>> >>
>> >> > Sorry, I've been distracted...
>> >>
>> >> >> The commit messages states:
>> >> >>     "net_rx_action() is the place where we could do with an accurate
>> >> >> predicition but,
>> >> >>     since that has proven tricky to calculate, a cheap worse-case (but not
>> >> too
>> >> >> bad)
>> >> >>     estimate is all we really need since the only thing we *must* prevent
>> is
>> >> >> xenvif_gop_skb()
>> >> >>     consuming more slots than are available."
>> >> >>
>> >> >> Your "worst-case" calculation stated in the commit message is clearly
>> not
>> >> the
>> >> >> worst case,
>> >> >> since it doesn't take calls to "get_next_rx_buffer" into account.
>> >> >>
>> >>
>> >> > It should be taking into account the behaviour of
>> start_new_rx_buffer(),
>> >> which should be true if a slot is full or a frag will overflow the current slot
>> and
>> >> doesn't require splitting.
>> >> > The code in net_rx_action() makes the assumption that each frag will
>> >> require as many slots as its size requires, i.e. it assumes no packing of
>> >> multiple frags into a single slot, so it should be a worst case.
>> >> > Did I miss something in that logic?
>> >>
>> >> Yes.
>> >> In "xenvif_gop_skb()" this loop:
>> >>
>> >>         for (i = 0; i < nr_frags; i++) {
>> >>                 xenvif_gop_frag_copy(vif, skb, npo,
>> >>                                      skb_frag_page(&skb_shinfo(skb)->frags[i]),
>> >>                                      skb_frag_size(&skb_shinfo(skb)->frags[i]),
>> >>                                      skb_shinfo(skb)->frags[i].page_offset,
>> >>                                      &head);
>> >>         }
>> >>
>> >> Is capable of using up (at least) 1 slot more than is anticipated for in
>> >> "net_rx_action()"  by this code:
>> >>
>> >>                 for (i = 0; i < skb_shinfo(skb)->nr_frags; i++) {
>> >>                         unsigned int size;
>> >>                         size = skb_frag_size(&skb_shinfo(skb)->frags[i]);
>> >>                         max_slots_needed += DIV_ROUND_UP(size, PAGE_SIZE);
>> >>                 }
>> >>
>> >> And this happens when it calls "get_next_rx_buffer()" from
>> >> "xenvif_gop_frag_copy()" where it's breaking down the frag.
>> >>
>> 
>> > The function that determines whether to consume another slot is
>> start_new_rx_buffer() and for each frag I don't see why this would return
>> true more than DIV_ROUND_UP(size, PAGE_SIZE) times.
>> > It may be called more times than that since the code in
>> xenvif_gop_frag_copy() must also allow for the offset of the frag but should
>> not return true in all cases.
>> > So, I still cannot see why a frag would ever consume more than
>> DIV_ROUND_UP(size, PAGE_SIZE) slots.
>> 
>> Well here a case were a frag is broken down in 2 pieces:
>> 
>> [ 1156.870372] vif vif-7-0 vif7.0: ?!? xenvif_gop_frag_copy Me here 1  npo-
>> >meta_prod:39 vif->rx.sring->req_prod:2105867 vif->rx.req_cons:2105867
>> npo->copy_gref:760  npo->copy_off:4096  MAX_BUFFER_OFFSET:4096
>> bytes:560 size:560  offset:0 head:1273462060 i:2 vif->rx.sring-
>> >req_event:2104275 estimated_slots_needed:4 reserved_slots_left:0
>> [ 1156.871971] vif vif-7-0 vif7.0: ?!? xenvif_start_xmit stopping queue !
>> min_slots_needed:1 min_slots_needed_2:0 min_slots_needed_3:0 vif-
>> >rx.sring->req_prod:2105867 vif->rx.req_cons:2105867 vif->rx.sring-
>> >req_event:2105868 skb->len:66 skb->data_len:0
>> [ 1156.964316] vif vif-7-0 vif7.0: ?!? get_next_rx_buffer before req npo-
>> >meta_prod:39 vif->rx.sring->req_prod:2105867 vif->rx.req_cons:2105867
>> vif->rx.sring->req_event:2105868, reserved_slots_left:0
>> [ 1157.001635] vif vif-7-0 vif7.0: ?!? get_next_rx_buffer after req npo-
>> >meta_prod:39 vif->rx.sring->req_prod:2105867 vif->rx.req_cons:2105868
>> req->gref:4325379 req->id:11 vif->rx.sring->req_event:2105868
>> reserved_slots_left:-1
>> [ 1157.039095] vif vif-7-0 vif7.0: ?!? xenvif_gop_frag_copy Me here 2  npo-
>> >meta_prod:40 vif->rx.sring->req_prod:2105867 vif->rx.req_cons:2105868
>> npo->copy_gref:4325379  npo->copy_off:0  MAX_BUFFER_OFFSET:4096
>> bytes:560 size:560  offset:0 head:1273462060 i:2 vif->rx.sring-
>> >req_event:2105868 estimated_slots_needed:4 reserved_slots_left:-1
>> [ 1157.095216] vif vif-7-0 vif7.0: ?!? xenvif_gop_frag_copy Me here end npo-
>> >meta_prod:40 vif->rx.sring->req_prod:2105867 vif->rx.req_cons:2105868
>> npo->copy_gref:4325379 npo->copy_off:560  MAX_BUFFER_OFFSET:4096
>> bytes:560 size:0  offset:560 head:1273462060 i:3 vif->rx.sring-
>> >req_event:2105868 gso_gaps:0 estimated_slots_needed:4
>> reserved_slots_left:-1
>> [ 1157.151338] vif vif-7-0 vif7.0: ?!? xenvif_gop_skb Me here 4 after npo-
>> >meta_prod:40 old_meta_prod:36 vif->rx.sring->req_prod:2105867 vif-
>> >rx.req_cons:2105868 meta->gso_type:1 meta->gso_size:1448 nr_frags:1
>> req->gref:657 req->id:7 estimated_slots_needed:4 i(frag):0 j(data):1
>> reserved_slots_left:-1
>> [ 1157.188908] vif vif-7-0 vif7.0: ?!? xenvif_gop_skb Me here 5 npo-
>> >meta_prod:40 old_meta_prod:36 vif->rx.sring->req_prod:2105867 vif-
>> >rx.req_cons:2105868 meta->gso_type:1 meta->gso_size:1448 nr_frags:1
>> req->gref:657 req->id:7 estimated_slots_needed:4 j(data):1
>> reserved_slots_left:-1    used in funcstart: 0 + 1 .. used_dataloop:1 ..
>> used_fragloop:3
>> [ 1157.244975] vif vif-7-0 vif7.0: ?!? xenvif_rx_action me here 2 ..  vif-
>> >rx.sring->req_prod:2105867 vif->rx.req_cons:2105868 sco-
>> >meta_slots_used:4 max_upped_gso:1 skb_is_gso(skb):1
>> max_slots_needed:4 j:6 is_gso:1 nr_frags:1 firstpart:1 secondpart:2
>> reserved_slots_left:-1
>> 
>> - When processing an SKB we end up in "xenvif_gop_frag_copy" while prod
>> == cons ... but we still have bytes and size left ..
>> - start_new_rx_buffer() has returned true ..
>> - so we end up in get_next_rx_buffer
>> - this does a RING_GET_REQUEST and ups cons ..
>> - and we end up with a bad grant reference.
>> 
>> Sometimes we are saved by the bell .. since additional slots have become
>> free (you see cons become > prod in "get_next_rx_buffer" but shortly after
>> that prod is increased ..
>> just in time to not cause a overrun).
>> 

> Ah, but hang on... There's a BUG_ON meta_slots_used > max_slots_needed, so if we are overflowing the worst-case calculation then why is that BUG_ON not firing?

You mean:
                sco = (struct skb_cb_overlay *)skb->cb;
                sco->meta_slots_used = xenvif_gop_skb(skb, &npo);
                BUG_ON(sco->meta_slots_used > max_slots_needed);

in "get_next_rx_buffer" ?

I don't know .. at least now it doesn't crash dom0 and therefore not my complete machine and since tcp is recovering from a failed packet  :-)

But probably because "npo->copy_prod++" seems to be used for the frags .. and it isn't added to  npo->meta_prod ?
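(For reference, a condensed view of the counters involved; this is an illustrative paraphrase, not the verbatim driver code, with field names taken from the debug output above. copy_prod only counts the grant-copy operations that get queued, so it never enters that BUG_ON; the check compares the meta_prod delta against max_slots_needed.)

struct netrx_counters_sketch {          /* illustrative only */
        unsigned int copy_prod;         /* grant-copy ops queued; one slot can take
                                           several of these when a chunk is split   */
        unsigned int meta_prod;         /* meta entries; advances once per
                                           RING_GET_REQUEST(), and meta_slots_used
                                           is the delta of this counter             */
        int copy_off;                   /* fill level of the slot being packed      */
        unsigned int copy_gref;         /* grant ref of the slot being packed       */
};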

--
Sander

>   Paul

>> If you need additional / other info, please cook up a debug patch with what
>> you need.
>> 
>> --
>> Sander
>> 
>> 
>> 
>> 
>> 
>> 
>> >   Paul
>> 
>> >> Ultimately this results in bad grant reference warnings (and packets
>> marked
>> >> as "errors" in the interface statistics).
>> >>
>> >> In my case it always seems to be a skb with 1 frag which is broken down in
>> 5
>> >> or 6 pieces ..
>> >>
>> >> So "get_next_rx_buffer()" is called once .. and i'm overrunning the ring
>> with
>> >> 1 slot, but i'm not sure if that's not coincedence
>> >> since in the code there seem to be no explicit limitation on how often this
>> >> code path is taken. So perhaps it's implicitly limited
>> >> since packets and frags can't be arbitrarily large in comparison with the
>> >> page_size but that's not something i'm capable of figuring out :-)
>> >>
>> >>
>> >>
>> >> >   Paul
>> >>
>> >> >> Problem is that a worst case calculation would probably be reverting to
>> >> the
>> >> >> old calculation,
>> >> >> and the problems this patch was trying to solve would reappear, but
>> >> >> introducing new regressions
>> >> >> isn't very useful either. And since it seems such a tricky and fragile
>> thing to
>> >> >> determine, it would
>> >> >> probably be best to be split into a distinct function with a comment to
>> >> explain
>> >> >> the rationale used.
>> >> >>
>> >> >> Since this doesn't seem to progress very fast .. CC'ed some more folks
>> ..
>> >> you
>> >> >> never know ..
>> >> >>
>> >> >> --
>> >> >> Sander
>> >> >>
>> >> >>
>> >> >> Tuesday, March 25, 2014, 4:29:42 PM, you wrote:
>> >> >>
>> >> >>
>> >> >> > Tuesday, March 25, 2014, 4:15:39 PM, you wrote:
>> >> >>
>> >> >> >> On Sat, Mar 22, 2014 at 07:28:34PM +0100, Sander Eikelenboom
>> wrote:
>> >> >> >> [...]
>> >> >> >>> > Yes there is only one frag .. but it seems to be much larger than
>> >> >> PAGE_SIZE .. and xenvif_gop_frag_copy brakes that frag down into
>> >> smaller
>> >> >> bits .. hence the calculation in xenvif_rx_action determining the slots
>> >> needed
>> >> >> by doing:
>> >> >> >>>
>> >> >> >>> >                 for (i = 0; i < nr_frags; i++) {
>> >> >> >>> >                         unsigned int size;
>> >> >> >>> >                         size = skb_frag_size(&skb_shinfo(skb)->frags[i]);
>> >> >> >>> >                         max_slots_needed += DIV_ROUND_UP(size,
>> >> PAGE_SIZE);
>> >> >> >>> >                 }
>> >> >> >>>
>> >> >> >>> > But the code in xenvif_gop_frag_copy .. seems to be needing
>> one
>> >> >> more slot (from the emperical test) .. and calling "get_next_rx_buffer"
>> >> >> seems involved in that ..
>> >> >> >>>
>> >> >> >>> Hmm looked again .. and it seems this is it .. when your frags are
>> large
>> >> >> enough you have the chance of running into this.
>> >> >> >>>
>> >> >>
>> >> >> >> get_next_rx_buffer is guarded by start_new_rx_buffer. Do you
>> see
>> >> any
>> >> >> >> problem with that implementation?
>> >> >> > In general no, but "get_next_rx_buffer" up's cons .. and the
>> calculations
>> >> >> done in "xenvif_rx_action" for max_slots_needed to prevent the
>> overrun
>> >> >> > don't count in this possibility. So it's not the guarding of
>> >> >> "start_new_rx_buffer" that is at fault. It's the ones early in
>> >> >> "xenvif_rx_action".
>> >> >> > The ones that were changed by Paul's patch from MAX_SKB_FRAGS
>> to a
>> >> >> calculated value that should be a "slim fit".
>> >> >>
>> >> >> > The problem is in determining upfront in "xenvif_rx_action" when
>> and
>> >> how
>> >> >> often the "get_next_rx_buffer" path will be taken.
>> >> >> > Unless there are other non direct restrictions (from a size point of
>> view)
>> >> it
>> >> >> can be called multiple times per frag per skb.
>> >> >>
>> >> >> >>> Problem is .. i don't see an easy fix, the "one more slot" of the
>> >> empirical
>> >> >> test doesn't seem to be the worst case either (i think):
>> >> >> >>>
>> >> >> >>> - In my case the packets that hit this only have 1 frag, but i could
>> have
>> >> >> had more frags.
>> >> >> >>>   I also think you can't rule out the possibility of doing the
>> >> >> "get_next_rx_buffer" for multiple subsequent frags from one packet,
>> >> >> >>>   so in the worst (and perhaps even from a single frag since it's
>> looped
>> >> >> over a split of it in what seems PAGE_SIZE pieces.)
>> >> >> >>>   - So an exact calculation of how much slots we are going to need
>> for
>> >> >> hitting this "get_next_rx_buffer"  upfront in "xenvif_rx_action" seems
>> >> >> unfeasible.
>> >> >> >>>   - A worst case gamble seems impossible either .. if you take
>> multiple
>> >> >> frags * multiple times the "get_next_rx_buffer" ... you would probably
>> be
>> >> >> back at just
>> >> >> >>>     setting the needed_slots to MAX_SKB_FRAGS.
>> >> >> >>>
>> >> >> >>> - Other thing would be checking for the available slots before
>> doing
>> >> the
>> >> >> "get_next_rx_buffer" .. how ever .. we don't account for how many
>> slots
>> >> we
>> >> >> still need to
>> >> >> >>>   just process the remaining frags.
>> >> >> >>>
>> >> >>
>> >> >> >> We've done a worst case estimation for whole SKB (linear area + all
>> >> >> >> frags) already, at the very first beginning. That's what
>> >> >> >> max_slots_needed is for.
>> >> >>
>> >> >> >>> - Just remove the whole "get_next_rx_buffer".. just tested it but
>> it
>> >> >> seems the "get_next_rx_buffer" is necessary ..  when i make
>> >> >> start_new_rx_buffer always return false
>> >> >> >>>   i hit the bug below :S
>> >> >> >>>
>> >> >> >>> So the questions are ... is the "get_next_rx_buffer" part really
>> >> necessary
>> >> >> ?
>> >> >> >>>    - if not, what is the benefit of the "get_next_rx_buffer" and
>> does
>> >> that
>> >> >> outweigh the additional cost of accounting
>> >> >> >>>      of needed_slots for the frags that have yet to be processed ?
>> >> >> >>>    - if yes, erhmmm what would be the best acceptable solution ..
>> >> >> returning to using MAX_SKB_FRAGS ?
>> >> >> >>>
>> >> >>
>> >> >> >> I think you need to answer several questions first:
>> >> >> >> 1. is max_slots_needed actually large enough to cover whole SKB?
>> >> >> >         No it's not if, you end up calling "get_next_rx_buffer" one or
>> >> multiple
>> >> >> times when processing the SKB
>> >> >> >         you have the chance of overrunning (or be saved because prod
>> gets
>> >> >> upped before you overrun).
>> >> >>
>> >> >> >> 2. is the test of ring slot availability acurate?
>> >> >> >         Seems to be.
>> >> >>
>> >> >> >> 3. is the number of ring slots consumed larger than
>> >> max_slots_needed? (I
>> >> >> >>    guess the answer is yes)
>> >> >> >         Yes that was the whole point.
>> >> >>
>> >> >> >> 4. which step in the break-down procedure causes backend to
>> overrun
>> >> >> the
>> >> >> >>    ring?
>> >> >> >         The "get_next_rx_buffer" call .. that is not accounted for (which
>> can
>> >> be
>> >> >> called
>> >> >> >         multiple times per frag (unless some other none obvious size
>> >> >> restriction limits this
>> >> >> >         to one time per frag or one time per SKB).
>> >> >> >         In my errorneous case it is called one time, but it would be nice if
>> >> there
>> >> >> would be some theoretical proof
>> >> >> >         instead of just some emperical test.
>> >> >>
>> >> >>
>> >> >> >> It doesn't matter how many frags your SKB has and how big a frag
>> is. If
>> >> >> >> every step is correct then you're fine. The code you're looking at
>> >> >> >> (start_new_rx_buffer / get_next_rx_buffer and friend) needs to
>> be
>> >> >> >> carefully examined.
>> >> >>
>> >> >> > Well it seems it only calls "get_next_rx_buffer" in some special cases
>> ..
>> >> >> (and that also what i'm seeing because it doesn't happen
>> >> >> > continously.
>> >> >>
>> >> >> > Question is shouldn't this max_needed_slots calc be reverted to
>> what it
>> >> >> was before 3.14 and take the time in 3.15 to figure out a
>> >> >> > the theoretical max (if it can be calculated upfront) .. or another way
>> to
>> >> >> guarantee the code is able to process the whole SKB  ?
>> >> >>
>> >> >> >> Wei.
>> >> >>
>> >> >>
>> >> >>
>> >> >>
>> >>
>> >>
>> 
>> 

^ permalink raw reply	[flat|nested] 20+ messages in thread

* RE: [Xen-devel] Xen-unstable Linux 3.14-rc3 and 3.13 Network troubles "bisected"
  2014-03-26 16:53                                   ` Sander Eikelenboom
@ 2014-03-26 17:16                                     ` Paul Durrant
  2014-03-26 17:33                                       ` Sander Eikelenboom
  0 siblings, 1 reply; 20+ messages in thread
From: Paul Durrant @ 2014-03-26 17:16 UTC (permalink / raw)
  To: Sander Eikelenboom
  Cc: Wei Liu, annie li, Zoltan Kiss, xen-devel@lists.xen.org,
	Ian Campbell, linux-kernel, netdev@vger.kernel.org

> -----Original Message-----
> From: Sander Eikelenboom [mailto:linux@eikelenboom.it]
> Sent: 26 March 2014 16:54
> To: Paul Durrant
> Cc: Wei Liu; annie li; Zoltan Kiss; xen-devel@lists.xen.org; Ian Campbell; linux-
> kernel; netdev@vger.kernel.org
> Subject: Re: [Xen-devel] Xen-unstable Linux 3.14-rc3 and 3.13 Network
> troubles "bisected"
> 
> 
> Wednesday, March 26, 2014, 5:25:21 PM, you wrote:
> 
> >> -----Original Message-----
> >> From: Sander Eikelenboom [mailto:linux@eikelenboom.it]
> >> Sent: 26 March 2014 16:07
> >> To: Paul Durrant
> >> Cc: Wei Liu; annie li; Zoltan Kiss; xen-devel@lists.xen.org; Ian Campbell;
> linux-
> >> kernel; netdev@vger.kernel.org
> >> Subject: Re: [Xen-devel] Xen-unstable Linux 3.14-rc3 and 3.13 Network
> >> troubles "bisected"
> >>
> >>
> >> Wednesday, March 26, 2014, 4:50:30 PM, you wrote:
> >>
> >> >> -----Original Message-----
> >> >> From: Sander Eikelenboom [mailto:linux@eikelenboom.it]
> >> >> Sent: 26 March 2014 15:23
> >> >> To: Paul Durrant
> >> >> Cc: Wei Liu; annie li; Zoltan Kiss; xen-devel@lists.xen.org; Ian Campbell;
> >> linux-
> >> >> kernel; netdev@vger.kernel.org
> >> >> Subject: Re: [Xen-devel] Xen-unstable Linux 3.14-rc3 and 3.13 Network
> >> >> troubles "bisected"
> >> >>
> >> >>
> >> >> Wednesday, March 26, 2014, 3:44:42 PM, you wrote:
> >> >>
> >> >> >> -----Original Message-----
> >> >> >> From: Sander Eikelenboom [mailto:linux@eikelenboom.it]
> >> >> >> Sent: 26 March 2014 11:11
> >> >> >> To: Paul Durrant
> >> >> >> Cc: Wei Liu; annie li; Zoltan Kiss; xen-devel@lists.xen.org; Ian
> Campbell;
> >> >> linux-
> >> >> >> kernel; netdev@vger.kernel.org
> >> >> >> Subject: Re: [Xen-devel] Xen-unstable Linux 3.14-rc3 and 3.13
> Network
> >> >> >> troubles "bisected"
> >> >> >>
> >> >> >> Paul,
> >> >> >>
> >> >> >> You have been awfully silent for this whole thread while this is a
> >> >> regression
> >> >> >> caused by a patch of you
> >> >> >> (ca2f09f2b2c6c25047cfc545d057c4edfcfe561c as clearly stated much
> >> earlier
> >> >> in
> >> >> >> this thread).
> >> >> >>
> >> >>
> >> >> > Sorry, I've been distracted...
> >> >>
> >> >> >> The commit messages states:
> >> >> >>     "net_rx_action() is the place where we could do with an accurate
> >> >> >> predicition but,
> >> >> >>     since that has proven tricky to calculate, a cheap worse-case (but
> not
> >> >> too
> >> >> >> bad)
> >> >> >>     estimate is all we really need since the only thing we *must*
> prevent
> >> is
> >> >> >> xenvif_gop_skb()
> >> >> >>     consuming more slots than are available."
> >> >> >>
> >> >> >> Your "worst-case" calculation stated in the commit message is
> clearly
> >> not
> >> >> the
> >> >> >> worst case,
> >> >> >> since it doesn't take calls to "get_next_rx_buffer" into account.
> >> >> >>
> >> >>
> >> >> > It should be taking into account the behaviour of
> >> start_new_rx_buffer(),
> >> >> which should be true if a slot is full or a frag will overflow the current
> slot
> >> and
> >> >> doesn't require splitting.
> >> >> > The code in net_rx_action() makes the assumption that each frag will
> >> >> require as many slots as its size requires, i.e. it assumes no packing of
> >> >> multiple frags into a single slot, so it should be a worst case.
> >> >> > Did I miss something in that logic?
> >> >>
> >> >> Yes.
> >> >> In "xenvif_gop_skb()" this loop:
> >> >>
> >> >>         for (i = 0; i < nr_frags; i++) {
> >> >>                 xenvif_gop_frag_copy(vif, skb, npo,
> >> >>                                      skb_frag_page(&skb_shinfo(skb)->frags[i]),
> >> >>                                      skb_frag_size(&skb_shinfo(skb)->frags[i]),
> >> >>                                      skb_shinfo(skb)->frags[i].page_offset,
> >> >>                                      &head);
> >> >>         }
> >> >>
> >> >> Is capable of using up (at least) 1 slot more than is anticipated for in
> >> >> "net_rx_action()"  by this code:
> >> >>
> >> >>                 for (i = 0; i < skb_shinfo(skb)->nr_frags; i++) {
> >> >>                         unsigned int size;
> >> >>                         size = skb_frag_size(&skb_shinfo(skb)->frags[i]);
> >> >>                         max_slots_needed += DIV_ROUND_UP(size, PAGE_SIZE);
> >> >>                 }
> >> >>
> >> >> And this happens when it calls "get_next_rx_buffer()" from
> >> >> "xenvif_gop_frag_copy()" where it's breaking down the frag.
> >> >>
> >>
> >> > The function that determines whether to consume another slot is
> >> start_new_rx_buffer() and for each frag I don't see why this would return
> >> true more than DIV_ROUND_UP(size, PAGE_SIZE) times.
> >> > It may be called more times than that since the code in
> >> xenvif_gop_frag_copy() must also allow for the offset of the frag but
> should
> >> not return true in all cases.
> >> > So, I still cannot see why a frag would ever consume more than
> >> DIV_ROUND_UP(size, PAGE_SIZE) slots.
> >>
> >> Well here a case were a frag is broken down in 2 pieces:
> >>
> >> [ 1156.870372] vif vif-7-0 vif7.0: ?!? xenvif_gop_frag_copy Me here 1  npo-
> >> >meta_prod:39 vif->rx.sring->req_prod:2105867 vif-
> >rx.req_cons:2105867
> >> npo->copy_gref:760  npo->copy_off:4096  MAX_BUFFER_OFFSET:4096
> >> bytes:560 size:560  offset:0 head:1273462060 i:2 vif->rx.sring-
> >> >req_event:2104275 estimated_slots_needed:4 reserved_slots_left:0
> >> [ 1156.871971] vif vif-7-0 vif7.0: ?!? xenvif_start_xmit stopping queue !
> >> min_slots_needed:1 min_slots_needed_2:0 min_slots_needed_3:0 vif-
> >> >rx.sring->req_prod:2105867 vif->rx.req_cons:2105867 vif->rx.sring-
> >> >req_event:2105868 skb->len:66 skb->data_len:0
> >> [ 1156.964316] vif vif-7-0 vif7.0: ?!? get_next_rx_buffer before req npo-
> >> >meta_prod:39 vif->rx.sring->req_prod:2105867 vif-
> >rx.req_cons:2105867
> >> vif->rx.sring->req_event:2105868, reserved_slots_left:0
> >> [ 1157.001635] vif vif-7-0 vif7.0: ?!? get_next_rx_buffer after req npo-
> >> >meta_prod:39 vif->rx.sring->req_prod:2105867 vif-
> >rx.req_cons:2105868
> >> req->gref:4325379 req->id:11 vif->rx.sring->req_event:2105868
> >> reserved_slots_left:-1
> >> [ 1157.039095] vif vif-7-0 vif7.0: ?!? xenvif_gop_frag_copy Me here 2  npo-
> >> >meta_prod:40 vif->rx.sring->req_prod:2105867 vif-
> >rx.req_cons:2105868
> >> npo->copy_gref:4325379  npo->copy_off:0  MAX_BUFFER_OFFSET:4096
> >> bytes:560 size:560  offset:0 head:1273462060 i:2 vif->rx.sring-
> >> >req_event:2105868 estimated_slots_needed:4 reserved_slots_left:-1
> >> [ 1157.095216] vif vif-7-0 vif7.0: ?!? xenvif_gop_frag_copy Me here end
> npo-
> >> >meta_prod:40 vif->rx.sring->req_prod:2105867 vif-
> >rx.req_cons:2105868
> >> npo->copy_gref:4325379 npo->copy_off:560  MAX_BUFFER_OFFSET:4096
> >> bytes:560 size:0  offset:560 head:1273462060 i:3 vif->rx.sring-
> >> >req_event:2105868 gso_gaps:0 estimated_slots_needed:4
> >> reserved_slots_left:-1
> >> [ 1157.151338] vif vif-7-0 vif7.0: ?!? xenvif_gop_skb Me here 4 after npo-
> >> >meta_prod:40 old_meta_prod:36 vif->rx.sring->req_prod:2105867 vif-
> >> >rx.req_cons:2105868 meta->gso_type:1 meta->gso_size:1448 nr_frags:1
> >> req->gref:657 req->id:7 estimated_slots_needed:4 i(frag):0 j(data):1
> >> reserved_slots_left:-1
> >> [ 1157.188908] vif vif-7-0 vif7.0: ?!? xenvif_gop_skb Me here 5 npo-
> >> >meta_prod:40 old_meta_prod:36 vif->rx.sring->req_prod:2105867 vif-
> >> >rx.req_cons:2105868 meta->gso_type:1 meta->gso_size:1448 nr_frags:1
> >> req->gref:657 req->id:7 estimated_slots_needed:4 j(data):1
> >> reserved_slots_left:-1    used in funcstart: 0 + 1 .. used_dataloop:1 ..
> >> used_fragloop:3
> >> [ 1157.244975] vif vif-7-0 vif7.0: ?!? xenvif_rx_action me here 2 ..  vif-
> >> >rx.sring->req_prod:2105867 vif->rx.req_cons:2105868 sco-
> >> >meta_slots_used:4 max_upped_gso:1 skb_is_gso(skb):1
> >> max_slots_needed:4 j:6 is_gso:1 nr_frags:1 firstpart:1 secondpart:2
> >> reserved_slots_left:-1
> >>
> >> - When processing an SKB we end up in "xenvif_gop_frag_copy" while
> prod
> >> == cons ... but we still have bytes and size left ..
> >> - start_new_rx_buffer() has returned true ..
> >> - so we end up in get_next_rx_buffer
> >> - this does a RING_GET_REQUEST and ups cons ..
> >> - and we end up with a bad grant reference.
> >>
> >> Sometimes we are saved by the bell .. since additional slots have become
> >> free (you see cons become > prod in "get_next_rx_buffer" but shortly
> after
> >> that prod is increased ..
> >> just in time to not cause a overrun).
> >>
> 
> > Ah, but hang on... There's a BUG_ON meta_slots_used >
> max_slots_needed, so if we are overflowing the worst-case calculation then
> why is that BUG_ON not firing?
> 
> You mean:
>                 sco = (struct skb_cb_overlay *)skb->cb;
>                 sco->meta_slots_used = xenvif_gop_skb(skb, &npo);
>                 BUG_ON(sco->meta_slots_used > max_slots_needed);
> 
> in "get_next_rx_buffer" ?
> 

That code excerpt is from net_rx_action(), isn't it?

> I don't know .. at least now it doesn't crash dom0 and therefore not my
> complete machine and since tcp is recovering from a failed packet  :-)
> 

Well, if the code calculating max_slots_needed were underestimating then the BUG_ON() should fire. If it is not firing in your case then this suggests your problem lies elsewhere, or that meta_slots_used is not equal to the number of ring slots consumed.

> But probably because "npo->copy_prod++" seems to be used for the frags ..
> and it isn't added to  npo->meta_prod ?
> 

meta_slots_used is calculated as the value of meta_prod at return (from xenvif_gop_skb()) minus the value on entry, and if you look back up the code then you can see that meta_prod is incremented every time RING_GET_REQUEST() is evaluated. So, we must be consuming a slot without evaluating RING_GET_REQUEST() and I think that's exactly what's happening... Right at the bottom of xenvif_gop_frag_copy() req_cons is simply incremented in the case of a GSO. So the BUG_ON() is indeed off by one.
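To make that off-by-one concrete, here is a toy model (plain illustrative C, not the driver code; the numbers are only meant to mirror the estimated_slots_needed:4 case in the log above):

#include <stdio.h>

int main(void)
{
        unsigned int req_cons = 0;         /* ring requests consumed */
        unsigned int meta_prod = 0;        /* meta entries produced  */
        unsigned int max_slots_needed = 4; /* worst-case estimate, as in the log */
        unsigned int i;

        /* Slots taken via RING_GET_REQUEST() for the head and frag data:
         * both counters advance together. */
        for (i = 0; i < 4; i++) {
                req_cons++;
                meta_prod++;
        }

        /* GSO case at the bottom of xenvif_gop_frag_copy(): req_cons is bumped
         * once for the GSO descriptor without a matching meta entry. */
        req_cons++;

        /* meta delta is 4, so BUG_ON(meta_slots_used > max_slots_needed) stays
         * silent, yet 5 ring slots were consumed: one more than the estimate. */
        printf("meta_slots_used=%u ring_slots_used=%u max_slots_needed=%u\n",
               meta_prod, req_cons, max_slots_needed);
        return 0;
}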

  Paul

> --
> Sander
> 
> >   Paul
> 
> >> If you need additional / other info, please cook up a debug patch with
> what
> >> you need.
> >>
> >> --
> >> Sander
> >>
> >>
> >>
> >>
> >>
> >>
> >> >   Paul
> >>
> >> >> Ultimately this results in bad grant reference warnings (and packets
> >> marked
> >> >> as "errors" in the interface statistics).
> >> >>
> >> >> In my case it always seems to be a skb with 1 frag which is broken
> down in
> >> 5
> >> >> or 6 pieces ..
> >> >>
> >> >> So "get_next_rx_buffer()" is called once .. and i'm overrunning the ring
> >> with
> >> >> 1 slot, but i'm not sure if that's not coincedence
> >> >> since in the code there seem to be no explicit limitation on how often
> this
> >> >> code path is taken. So perhaps it's implicitly limited
> >> >> since packets and frags can't be arbitrarily large in comparison with the
> >> >> page_size but that's not something i'm capable of figuring out :-)
> >> >>
> >> >>
> >> >>
> >> >> >   Paul
> >> >>
> >> >> >> Problem is that a worst case calculation would probably be reverting
> to
> >> >> the
> >> >> >> old calculation,
> >> >> >> and the problems this patch was trying to solve would reappear,
> but
> >> >> >> introducing new regressions
> >> >> >> isn't very useful either. And since it seems such a tricky and fragile
> >> thing to
> >> >> >> determine, it would
> >> >> >> probably be best to be split into a distinct function with a comment
> to
> >> >> explain
> >> >> >> the rationale used.
> >> >> >>
> >> >> >> Since this doesn't seem to progress very fast .. CC'ed some more
> folks
> >> ..
> >> >> you
> >> >> >> never know ..
> >> >> >>
> >> >> >> --
> >> >> >> Sander
> >> >> >>
> >> >> >>
> >> >> >> Tuesday, March 25, 2014, 4:29:42 PM, you wrote:
> >> >> >>
> >> >> >>
> >> >> >> > Tuesday, March 25, 2014, 4:15:39 PM, you wrote:
> >> >> >>
> >> >> >> >> On Sat, Mar 22, 2014 at 07:28:34PM +0100, Sander Eikelenboom
> >> wrote:
> >> >> >> >> [...]
> >> >> >> >>> > Yes there is only one frag .. but it seems to be much larger
> than
> >> >> >> PAGE_SIZE .. and xenvif_gop_frag_copy brakes that frag down into
> >> >> smaller
> >> >> >> bits .. hence the calculation in xenvif_rx_action determining the
> slots
> >> >> needed
> >> >> >> by doing:
> >> >> >> >>>
> >> >> >> >>> >                 for (i = 0; i < nr_frags; i++) {
> >> >> >> >>> >                         unsigned int size;
> >> >> >> >>> >                         size = skb_frag_size(&skb_shinfo(skb)->frags[i]);
> >> >> >> >>> >                         max_slots_needed += DIV_ROUND_UP(size,
> >> >> PAGE_SIZE);
> >> >> >> >>> >                 }
> >> >> >> >>>
> >> >> >> >>> > But the code in xenvif_gop_frag_copy .. seems to be needing
> >> one
> >> >> >> more slot (from the emperical test) .. and calling
> "get_next_rx_buffer"
> >> >> >> seems involved in that ..
> >> >> >> >>>
> >> >> >> >>> Hmm looked again .. and it seems this is it .. when your frags
> are
> >> large
> >> >> >> enough you have the chance of running into this.
> >> >> >> >>>
> >> >> >>
> >> >> >> >> get_next_rx_buffer is guarded by start_new_rx_buffer. Do you
> >> see
> >> >> any
> >> >> >> >> problem with that implementation?
> >> >> >> > In general no, but "get_next_rx_buffer" up's cons .. and the
> >> calculations
> >> >> >> done in "xenvif_rx_action" for max_slots_needed to prevent the
> >> overrun
> >> >> >> > don't count in this possibility. So it's not the guarding of
> >> >> >> "start_new_rx_buffer" that is at fault. It's the ones early in
> >> >> >> "xenvif_rx_action".
> >> >> >> > The ones that were changed by Paul's patch from
> MAX_SKB_FRAGS
> >> to a
> >> >> >> calculated value that should be a "slim fit".
> >> >> >>
> >> >> >> > The problem is in determining upfront in "xenvif_rx_action" when
> >> and
> >> >> how
> >> >> >> often the "get_next_rx_buffer" path will be taken.
> >> >> >> > Unless there are other non direct restrictions (from a size point of
> >> view)
> >> >> it
> >> >> >> can be called multiple times per frag per skb.
> >> >> >>
> >> >> >> >>> Problem is .. i don't see an easy fix, the "one more slot" of the
> >> >> empirical
> >> >> >> test doesn't seem to be the worst case either (i think):
> >> >> >> >>>
> >> >> >> >>> - In my case the packets that hit this only have 1 frag, but i
> could
> >> have
> >> >> >> had more frags.
> >> >> >> >>>   I also think you can't rule out the possibility of doing the
> >> >> >> "get_next_rx_buffer" for multiple subsequent frags from one
> packet,
> >> >> >> >>>   so in the worst (and perhaps even from a single frag since it's
> >> looped
> >> >> >> over a split of it in what seems PAGE_SIZE pieces.)
> >> >> >> >>>   - So an exact calculation of how much slots we are going to
> need
> >> for
> >> >> >> hitting this "get_next_rx_buffer"  upfront in "xenvif_rx_action"
> seems
> >> >> >> unfeasible.
> >> >> >> >>>   - A worst case gamble seems impossible either .. if you take
> >> multiple
> >> >> >> frags * multiple times the "get_next_rx_buffer" ... you would
> probably
> >> be
> >> >> >> back at just
> >> >> >> >>>     setting the needed_slots to MAX_SKB_FRAGS.
> >> >> >> >>>
> >> >> >> >>> - Other thing would be checking for the available slots before
> >> doing
> >> >> the
> >> >> >> "get_next_rx_buffer" .. how ever .. we don't account for how
> many
> >> slots
> >> >> we
> >> >> >> still need to
> >> >> >> >>>   just process the remaining frags.
> >> >> >> >>>
> >> >> >>
> >> >> >> >> We've done a worst case estimation for whole SKB (linear area +
> all
> >> >> >> >> frags) already, at the very first beginning. That's what
> >> >> >> >> max_slots_needed is for.
> >> >> >>
> >> >> >> >>> - Just remove the whole "get_next_rx_buffer".. just tested it
> but
> >> it
> >> >> >> seems the "get_next_rx_buffer" is necessary ..  when i make
> >> >> >> start_new_rx_buffer always return false
> >> >> >> >>>   i hit the bug below :S
> >> >> >> >>>
> >> >> >> >>> So the questions are ... is the "get_next_rx_buffer" part really
> >> >> necessary
> >> >> >> ?
> >> >> >> >>>    - if not, what is the benefit of the "get_next_rx_buffer" and
> >> does
> >> >> that
> >> >> >> outweigh the additional cost of accounting
> >> >> >> >>>      of needed_slots for the frags that have yet to be processed
> ?
> >> >> >> >>>    - if yes, erhmmm what would be the best acceptable solution
> ..
> >> >> >> returning to using MAX_SKB_FRAGS ?
> >> >> >> >>>
> >> >> >>
> >> >> >> >> I think you need to answer several questions first:
> >> >> >> >> 1. is max_slots_needed actually large enough to cover whole
> SKB?
> >> >> >> >         No it's not if, you end up calling "get_next_rx_buffer" one or
> >> >> multiple
> >> >> >> times when processing the SKB
> >> >> >> >         you have the chance of overrunning (or be saved because
> prod
> >> gets
> >> >> >> upped before you overrun).
> >> >> >>
> >> >> >> >> 2. is the test of ring slot availability acurate?
> >> >> >> >         Seems to be.
> >> >> >>
> >> >> >> >> 3. is the number of ring slots consumed larger than
> >> >> max_slots_needed? (I
> >> >> >> >>    guess the answer is yes)
> >> >> >> >         Yes that was the whole point.
> >> >> >>
> >> >> >> >> 4. which step in the break-down procedure causes backend to
> >> overrun
> >> >> >> the
> >> >> >> >>    ring?
> >> >> >> >         The "get_next_rx_buffer" call .. that is not accounted for
> (which
> >> can
> >> >> be
> >> >> >> called
> >> >> >> >         multiple times per frag (unless some other none obvious size
> >> >> >> restriction limits this
> >> >> >> >         to one time per frag or one time per SKB).
> >> >> >> >         In my errorneous case it is called one time, but it would be
> nice if
> >> >> there
> >> >> >> would be some theoretical proof
> >> >> >> >         instead of just some emperical test.
> >> >> >>
> >> >> >>
> >> >> >> >> It doesn't matter how many frags your SKB has and how big a
> frag
> >> is. If
> >> >> >> >> every step is correct then you're fine. The code you're looking at
> >> >> >> >> (start_new_rx_buffer / get_next_rx_buffer and friend) needs
> to
> >> be
> >> >> >> >> carefully examined.
> >> >> >>
> >> >> >> > Well it seems it only calls "get_next_rx_buffer" in some special
> cases
> >> ..
> >> >> >> (and that also what i'm seeing because it doesn't happen
> >> >> >> > continously.
> >> >> >>
> >> >> >> > Question is shouldn't this max_needed_slots calc be reverted to
> >> what it
> >> >> >> was before 3.14 and take the time in 3.15 to figure out a
> >> >> >> > the theoretical max (if it can be calculated upfront) .. or another
> way
> >> to
> >> >> >> guarantee the code is able to process the whole SKB  ?
> >> >> >>
> >> >> >> >> Wei.
> >> >> >>
> >> >> >>
> >> >> >>
> >> >> >>
> >> >>
> >> >>
> >>
> >>
> 
> 

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [Xen-devel] Xen-unstable Linux 3.14-rc3 and 3.13 Network troubles "bisected"
  2014-03-26 17:16                                     ` Paul Durrant
@ 2014-03-26 17:33                                       ` Sander Eikelenboom
  2014-03-26 17:46                                         ` Paul Durrant
  2014-03-26 17:48                                         ` Paul Durrant
  0 siblings, 2 replies; 20+ messages in thread
From: Sander Eikelenboom @ 2014-03-26 17:33 UTC (permalink / raw)
  To: Paul Durrant
  Cc: Wei Liu, annie li, Zoltan Kiss, xen-devel@lists.xen.org,
	Ian Campbell, linux-kernel, netdev@vger.kernel.org


Hi Paul,

Seems your last mail arrived in pretty bad shape (truncated) in my mailbox ..

--
Sander

Wednesday, March 26, 2014, 6:16:49 PM, you wrote:

>> -----Original Message-----
>> From: Sander Eikelenboom [mailto:linux@eikelenboom.it]
>> Sent: 26 March 2014 16:54
>> To: Paul Durrant
>> Cc: Wei Liu; annie li; Zoltan Kiss; xen-devel@lists.xen.org; Ian Campbell; linux-
>> kernel; netdev@vger.kernel.org
>> Subject: Re: [Xen-devel] Xen-unstable Linux 3.14-rc3 and 3.13 Network
>> troubles "bisected"
>> 
>> 
>> Wednesday, March 26, 2014, 5:25:21 PM, you wrote:
>> 
>> >> -----Original Message-----
>> >> From: Sander Eikelenboom [mailto:linux@eikelenboom.it]
>> >> Sent: 26 March 2014 16:07
>> >> To: Paul Durrant
>> >> Cc: Wei Liu; annie li; Zoltan Kiss; xen-devel@lists.xen.org; Ian Campbell;
>> linux-
>> >> kernel; netdev@vger.kernel.org
>> >> Subject: Re: [Xen-devel] Xen-unstable Linux 3.14-rc3 and 3.13 Network
>> >> troubles "bisected"
>> >>
>> >>
>> >> Wednesday, March 26, 2014, 4:50:30 PM, you wrote:
>> >>
>> >> >> -----Original Message-----
>> >> >> From: Sander Eikelenboom [mailto:linux@eikelenboom.it]
>> >> >> Sent: 26 March 2014 15:23
>> >> >> To: Paul Durrant
>> >> >> Cc: Wei Liu; annie li; Zoltan Kiss; xen-devel@lists.xen.org; Ian Campbell;
>> >> linux-
>> >> >> kernel; netdev@vger.kernel.org
>> >> >> Subject: Re: [Xen-devel] Xen-unstable Linux 3.14-rc3 and 3.13 Network
>> >> >> troubles "bisected"
>> >> >>
>> >> >>
>> >> >> Wednesday, March 26, 2014, 3:44:42 PM, you wrote:
>> >> >>
>> >> >> >> -----Original Message-----
>> >> >> >> From: Sander Eikelenboom [mailto:linux@eikelenboom.it]
>> >> >> >> Sent: 26 March 2014 11:11
>> >> >> >> To: Paul Durrant
>> >> >> >> Cc: Wei Liu; annie li; Zoltan Kiss; xen-devel@lists.xen.org; Ian
>> Campbell;
>> >> >> linux-
>> >> >> >> kernel; netdev@vger.kernel.org
>> >> >> >> Subject: Re: [Xen-devel] Xen-unstable Linux 3.14-rc3 and 3.13
>> Network
>> >> >> >> troubles "bisected"
>> >> >> >>
>> >> >> >> Paul,
>> >> >> >>
>> >> >> >> You have been awfully silent for this whole thread while this is a
>> >> >> regression
>> >> >> >> caused by a patch of you
>> >> >> >> (ca2f09f2b2c6c25047cfc545d057c4edfcfe561c as clearly stated much
>> >> earlier
>> >> >> in
>> >> >> >> this thread).
>> >> >> >>
>> >> >>
>> >> >> > Sorry, I've been distracted...
>> >> >>
>> >> >> >> The commit messages states:
>> >> >> >>     "net_rx_action() is the place where we could do with an accurate
>> >> >> >> predicition but,
>> >> >> >>     since that has proven tricky to calculate, a cheap worse-case (but
>> not
>> >> >> too
>> >> >> >> bad)
>> >> >> >>     estimate is all we really need since the only thing we *must*
>> prevent
>> >> is
>> >> >> >> xenvif_gop_skb()
>> >> >> >>     consuming more slots than are available."
>> >> >> >>
>> >> >> >> Your "worst-case" calculation stated in the commit message is
>> clearly
>> >> not
>> >> >> the
>> >> >> >> worst case,
>> >> >> >> since it doesn't take calls to "get_next_rx_buffer" into account

^ permalink raw reply	[flat|nested] 20+ messages in thread

* RE: [Xen-devel] Xen-unstable Linux 3.14-rc3 and 3.13 Network troubles "bisected"
  2014-03-26 17:33                                       ` Sander Eikelenboom
@ 2014-03-26 17:46                                         ` Paul Durrant
  2014-03-26 18:07                                           ` Sander Eikelenboom
  2014-03-26 17:48                                         ` Paul Durrant
  1 sibling, 1 reply; 20+ messages in thread
From: Paul Durrant @ 2014-03-26 17:46 UTC (permalink / raw)
  To: Sander Eikelenboom
  Cc: Wei Liu, annie li, Zoltan Kiss, xen-devel@lists.xen.org,
	Ian Campbell, linux-kernel, netdev@vger.kernel.org

Re-send shortened version...

> -----Original Message-----
> From: Sander Eikelenboom [mailto:linux@eikelenboom.it]
> Sent: 26 March 2014 16:54
> To: Paul Durrant
> Cc: Wei Liu; annie li; Zoltan Kiss; xen-devel@lists.xen.org; Ian Campbell; linux-
> kernel; netdev@vger.kernel.org
> Subject: Re: [Xen-devel] Xen-unstable Linux 3.14-rc3 and 3.13 Network
> troubles "bisected"
> 
[snip]
> >>
> >> - When processing an SKB we end up in "xenvif_gop_frag_copy" while
> prod
> >> == cons ... but we still have bytes and size left ..
> >> - start_new_rx_buffer() has returned true ..
> >> - so we end up in get_next_rx_buffer
> >> - this does a RING_GET_REQUEST and ups cons ..
> >> - and we end up with a bad grant reference.
> >>
> >> Sometimes we are saved by the bell .. since additional slots have become
> >> free (you see cons become > prod in "get_next_rx_buffer" but shortly
> after
> >> that prod is increased ..
> >> just in time to not cause a overrun).
> >>
> 
> > Ah, but hang on... There's a BUG_ON meta_slots_used >
> max_slots_needed, so if we are overflowing the worst-case calculation then
> why is that BUG_ON not firing?
> 
> You mean:
>                 sco = (struct skb_cb_overlay *)skb->cb;
>                 sco->meta_slots_used = xenvif_gop_skb(skb, &npo);
>                 BUG_ON(sco->meta_slots_used > max_slots_needed);
> 
> in "get_next_rx_buffer" ?
> 

That code excerpt is from net_rx_action(), isn't it?

> I don't know .. at least now it doesn't crash dom0 and therefore not my
> complete machine and since tcp is recovering from a failed packet  :-)
> 

Well, if the code calculating max_slots_needed were underestimating then the BUG_ON() should fire. If it is not firing in your case then this suggests your problem lies elsewhere, or that meta_slots_used is not equal to the number of ring slots consumed.

> But probably because "npo->copy_prod++" seems to be used for the frags ..
> and it isn't added to  npo->meta_prod ?
> 

meta_slots_used is calculated as the value of meta_prod at return (from xenvif_gop_skb()) minus the value on entry, and if you look back up the code then you can see that meta_prod is incremented every time RING_GET_REQUEST() is evaluated. So, we must be consuming a slot without evaluating RING_GET_REQUEST() and I think that's exactly what's happening... Right at the bottom of xenvif_gop_frag_copy() req_cons is simply incremented in the case of a GSO. So the BUG_ON() is indeed off by one.

  Paul

^ permalink raw reply	[flat|nested] 20+ messages in thread

* RE: [Xen-devel] Xen-unstable Linux 3.14-rc3 and 3.13 Network troubles "bisected"
  2014-03-26 17:33                                       ` Sander Eikelenboom
  2014-03-26 17:46                                         ` Paul Durrant
@ 2014-03-26 17:48                                         ` Paul Durrant
  2014-03-26 19:57                                           ` Sander Eikelenboom
  1 sibling, 1 reply; 20+ messages in thread
From: Paul Durrant @ 2014-03-26 17:48 UTC (permalink / raw)
  To: Sander Eikelenboom
  Cc: Wei Liu, annie li, Zoltan Kiss, xen-devel@lists.xen.org,
	Ian Campbell, linux-kernel, netdev@vger.kernel.org

> -----Original Message-----
> From: Paul Durrant
> Sent: 26 March 2014 17:47
> To: 'Sander Eikelenboom'
> Cc: Wei Liu; annie li; Zoltan Kiss; xen-devel@lists.xen.org; Ian Campbell; linux-
> kernel; netdev@vger.kernel.org
> Subject: RE: [Xen-devel] Xen-unstable Linux 3.14-rc3 and 3.13 Network
> troubles "bisected"
> 
> Re-send shortened version...
> 
> > -----Original Message-----
> > From: Sander Eikelenboom [mailto:linux@eikelenboom.it]
> > Sent: 26 March 2014 16:54
> > To: Paul Durrant
> > Cc: Wei Liu; annie li; Zoltan Kiss; xen-devel@lists.xen.org; Ian Campbell;
> linux-
> > kernel; netdev@vger.kernel.org
> > Subject: Re: [Xen-devel] Xen-unstable Linux 3.14-rc3 and 3.13 Network
> > troubles "bisected"
> >
> [snip]
> > >>
> > >> - When processing an SKB we end up in "xenvif_gop_frag_copy" while
> > prod
> > >> == cons ... but we still have bytes and size left ..
> > >> - start_new_rx_buffer() has returned true ..
> > >> - so we end up in get_next_rx_buffer
> > >> - this does a RING_GET_REQUEST and ups cons ..
> > >> - and we end up with a bad grant reference.
> > >>
> > >> Sometimes we are saved by the bell .. since additional slots have
> become
> > >> free (you see cons become > prod in "get_next_rx_buffer" but shortly
> > after
> > >> that prod is increased ..
> > >> just in time to not cause a overrun).
> > >>
> >
> > > Ah, but hang on... There's a BUG_ON meta_slots_used >
> > max_slots_needed, so if we are overflowing the worst-case calculation
> then
> > why is that BUG_ON not firing?
> >
> > You mean:
> >                 sco = (struct skb_cb_overlay *)skb->cb;
> >                 sco->meta_slots_used = xenvif_gop_skb(skb, &npo);
> >                 BUG_ON(sco->meta_slots_used > max_slots_needed);
> >
> > in "get_next_rx_buffer" ?
> >
> 
> That code excerpt is from net_rx_action(),isn't it?
> 
> > I don't know .. at least now it doesn't crash dom0 and therefore not my
> > complete machine and since tcp is recovering from a failed packet  :-)
> >
> 
> Well, if the code calculating max_slots_needed were underestimating then
> the BUG_ON() should fire. If it is not firing in your case then this suggests
> your problem lies elsewhere, or that meta_slots_used is not equal to the
> number of ring slots consumed.
> 
> > But probably because "npo->copy_prod++" seems to be used for the frags
> ..
> > and it isn't added to  npo->meta_prod ?
> >
> 
> meta_slots_used is calculated as the value of meta_prod at return (from
> xenvif_gop_skb()) minus the value on entry , and if you look back up the
> code then you can see that meta_prod is incremented every time
> RING_GET_REQUEST() is evaluated. So, we must be consuming a slot without
> evaluating RING_GET_REQUEST() and I think that's exactly what's
> happening... Right at the bottom of xenvif_gop_frag_copy() req_cons is
> simply incremented in the case of a GSO. So the BUG_ON() is indeed off by
> one.
> 

Can you re-test with the following patch applied?

  Paul

diff --git a/drivers/net/xen-netback/netback.c b/drivers/net/xen-netback/netback
index 438d0c0..4f24220 100644
--- a/drivers/net/xen-netback/netback.c
+++ b/drivers/net/xen-netback/netback.c
@@ -482,6 +482,8 @@ static void xenvif_rx_action(struct xenvif *vif)

        while ((skb = skb_dequeue(&vif->rx_queue)) != NULL) {
                RING_IDX max_slots_needed;
+               RING_IDX old_req_cons;
+               RING_IDX ring_slots_used;
                int i;

                /* We need a cheap worse case estimate for the number of
@@ -511,8 +513,12 @@ static void xenvif_rx_action(struct xenvif *vif)
                        vif->rx_last_skb_slots = 0;

                sco = (struct skb_cb_overlay *)skb->cb;
+
+               old_req_cons = vif->rx.req_cons;
                sco->meta_slots_used = xenvif_gop_skb(skb, &npo);
-               BUG_ON(sco->meta_slots_used > max_slots_needed);
+               ring_slots_used = vif->rx.req_cons - old_req_cons;
+
+               BUG_ON(ring_slots_used > max_slots_needed);

                __skb_queue_tail(&rxq, skb);
        }

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* Re: [Xen-devel] Xen-unstable Linux 3.14-rc3 and 3.13 Network troubles "bisected"
  2014-03-26 17:46                                         ` Paul Durrant
@ 2014-03-26 18:07                                           ` Sander Eikelenboom
  2014-03-26 18:15                                             ` Paul Durrant
  0 siblings, 1 reply; 20+ messages in thread
From: Sander Eikelenboom @ 2014-03-26 18:07 UTC (permalink / raw)
  To: Paul Durrant
  Cc: Wei Liu, annie li, Zoltan Kiss, xen-devel@lists.xen.org,
	Ian Campbell, linux-kernel, netdev@vger.kernel.org


Wednesday, March 26, 2014, 6:46:06 PM, you wrote:

> Re-send shortened version...

>> -----Original Message-----
>> From: Sander Eikelenboom [mailto:linux@eikelenboom.it]
>> Sent: 26 March 2014 16:54
>> To: Paul Durrant
>> Cc: Wei Liu; annie li; Zoltan Kiss; xen-devel@lists.xen.org; Ian Campbell; linux-
>> kernel; netdev@vger.kernel.org
>> Subject: Re: [Xen-devel] Xen-unstable Linux 3.14-rc3 and 3.13 Network
>> troubles "bisected"
>> 
> [snip]
>> >>
>> >> - When processing an SKB we end up in "xenvif_gop_frag_copy" while
>> prod
>> >> == cons ... but we still have bytes and size left ..
>> >> - start_new_rx_buffer() has returned true ..
>> >> - so we end up in get_next_rx_buffer
>> >> - this does a RING_GET_REQUEST and ups cons ..
>> >> - and we end up with a bad grant reference.
>> >>
>> >> Sometimes we are saved by the bell .. since additional slots have become
>> >> free (you see cons become > prod in "get_next_rx_buffer" but shortly
>> after
>> >> that prod is increased ..
>> >> just in time to not cause a overrun).
>> >>
>> 
>> > Ah, but hang on... There's a BUG_ON meta_slots_used >
>> max_slots_needed, so if we are overflowing the worst-case calculation then
>> why is that BUG_ON not firing?
>> 
>> You mean:
>>                 sco = (struct skb_cb_overlay *)skb->cb;
>>                 sco->meta_slots_used = xenvif_gop_skb(skb, &npo);
>>                 BUG_ON(sco->meta_slots_used > max_slots_needed);
>> 
>> in "get_next_rx_buffer" ?
>> 

> That code excerpt is from net_rx_action(),isn't it?

Yes

>> I don't know .. at least now it doesn't crash dom0 and therefore not my
>> complete machine and since tcp is recovering from a failed packet  :-)
>> 

> Well, if the code calculating max_slots_needed were underestimating then the BUG_ON() should fire. If it is not firing in your case then this suggests your problem lies elsewhere, or that meta_slots_used is not equal to the number of ring slots consumed.

It seems to be the latter ..

[ 1157.188908] vif vif-7-0 vif7.0: ?!? xenvif_gop_skb Me here 5 npo->meta_prod:40 old_meta_prod:36 vif->rx.sring->req_prod:2105867 vif->rx.req_cons:2105868 meta->gso_type:1 meta->gso_size:1448 nr_frags:1 req->gref:657 req->id:7 estimated_slots_needed:4 j(data):1 reserved_slots_left:-1    used in funcstart: 0 + 1 .. used_dataloop:1 .. used_fragloop:3
[ 1157.244975] vif vif-7-0 vif7.0: ?!? xenvif_rx_action me here 2 ..  vif->rx.sring->req_prod:2105867 vif->rx.req_cons:2105868 sco->meta_slots_used:4 max_upped_gso:1 skb_is_gso(skb):1 max_slots_needed:4 j:6 is_gso:1 nr_frags:1 firstpart:1 secondpart:2 reserved_slots_left:-1

net_rx_action() calculated we would need 4 slots .. and sco->meta_slots_used == 4 when we return so it doesn't trigger your BUG_ON ..

The 4 slots we calculated are:
  1 slot for the data part: DIV_ROUND_UP(offset_in_page(skb->data) + skb_headlen(skb), PAGE_SIZE)
  2 slots for the single frag in this SKB from: DIV_ROUND_UP(size, PAGE_SIZE)
  1 slot since GSO

In the debug code I annotated all cons++, and the code uses 1 slot to process the data from the SKB as expected, but uses 3 slots in the frag chopping loop.
And when it reaches the state where cons > prod it is always in "get_next_rx_buffer".
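
Put as code, the estimate that produced those 4 slots is roughly this (a sketch of the calculation with assumed example sizes, not the driver source):

#include <stdio.h>

#define PAGE_SIZE 4096u
#define DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d))

int main(void)
{
        unsigned int head_off = 0, headlen = 66;   /* assumed example values */
        unsigned int frag_size = 5792;             /* e.g. 4 * 1448, a single frag */
        unsigned int slots;

        slots  = DIV_ROUND_UP(head_off + headlen, PAGE_SIZE); /* 1: linear data */
        slots += DIV_ROUND_UP(frag_size, PAGE_SIZE);          /* 2: the frag */
        slots += 1;                                            /* 1: GSO */

        printf("max_slots_needed = %u\n", slots);              /* prints 4 */
        return 0;
}

The frag term is the one that diverges from the actual consumption (used_fragloop:3 in the log above).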

>> But probably because "npo->copy_prod++" seems to be used for the frags ..
>> and it isn't added to  npo->meta_prod ?
>> 

> meta_slots_used is calculated as the value of meta_prod at return (from xenvif_gop_skb()) minus the value on entry ,
> and if you look back up the code then you can see that meta_prod is incremented every time RING_GET_REQUEST() is evaluated.
> So, we must be consuming a slot without evaluating RING_GET_REQUEST() and I think that's exactly what's happening...
> Right at the bottom of xenvif_gop_frag_copy() req_cons is simply incremented in the case of a GSO. So the BUG_ON() is indeed off by one.

That is probably only done on the first iteration / frag?
Because I don't see my warning there trigger .. but it could be that's because at that moment we still have cons <= prod.


>   Paul

^ permalink raw reply	[flat|nested] 20+ messages in thread

* RE: [Xen-devel] Xen-unstable Linux 3.14-rc3 and 3.13 Network troubles "bisected"
  2014-03-26 18:07                                           ` Sander Eikelenboom
@ 2014-03-26 18:15                                             ` Paul Durrant
  2014-03-26 18:42                                               ` Paul Durrant
  2014-03-26 20:17                                               ` Sander Eikelenboom
  0 siblings, 2 replies; 20+ messages in thread
From: Paul Durrant @ 2014-03-26 18:15 UTC (permalink / raw)
  To: Sander Eikelenboom
  Cc: Wei Liu, annie li, Zoltan Kiss, xen-devel@lists.xen.org,
	Ian Campbell, linux-kernel, netdev@vger.kernel.org

> -----Original Message-----
> From: Sander Eikelenboom [mailto:linux@eikelenboom.it]
> Sent: 26 March 2014 18:08
> To: Paul Durrant
> Cc: Wei Liu; annie li; Zoltan Kiss; xen-devel@lists.xen.org; Ian Campbell; linux-
> kernel; netdev@vger.kernel.org
> Subject: Re: [Xen-devel] Xen-unstable Linux 3.14-rc3 and 3.13 Network
> troubles "bisected"
> 
> 
> Wednesday, March 26, 2014, 6:46:06 PM, you wrote:
> 
> > Re-send shortened version...
> 
> >> -----Original Message-----
> >> From: Sander Eikelenboom [mailto:linux@eikelenboom.it]
> >> Sent: 26 March 2014 16:54
> >> To: Paul Durrant
> >> Cc: Wei Liu; annie li; Zoltan Kiss; xen-devel@lists.xen.org; Ian Campbell;
> linux-
> >> kernel; netdev@vger.kernel.org
> >> Subject: Re: [Xen-devel] Xen-unstable Linux 3.14-rc3 and 3.13 Network
> >> troubles "bisected"
> >>
> > [snip]
> >> >>
> >> >> - When processing an SKB we end up in "xenvif_gop_frag_copy" while
> >> prod
> >> >> == cons ... but we still have bytes and size left ..
> >> >> - start_new_rx_buffer() has returned true ..
> >> >> - so we end up in get_next_rx_buffer
> >> >> - this does a RING_GET_REQUEST and ups cons ..
> >> >> - and we end up with a bad grant reference.
> >> >>
> >> >> Sometimes we are saved by the bell .. since additional slots have
> become
> >> >> free (you see cons become > prod in "get_next_rx_buffer" but shortly
> >> after
> >> >> that prod is increased ..
> >> >> just in time to not cause a overrun).
> >> >>
> >>
> >> > Ah, but hang on... There's a BUG_ON meta_slots_used >
> >> max_slots_needed, so if we are overflowing the worst-case calculation
> then
> >> why is that BUG_ON not firing?
> >>
> >> You mean:
> >>                 sco = (struct skb_cb_overlay *)skb->cb;
> >>                 sco->meta_slots_used = xenvif_gop_skb(skb, &npo);
> >>                 BUG_ON(sco->meta_slots_used > max_slots_needed);
> >>
> >> in "get_next_rx_buffer" ?
> >>
> 
> > That code excerpt is from net_rx_action(),isn't it?
> 
> Yes
> 
> >> I don't know .. at least now it doesn't crash dom0 and therefore not my
> >> complete machine and since tcp is recovering from a failed packet  :-)
> >>
> 
> > Well, if the code calculating max_slots_needed were underestimating then
> the BUG_ON() should fire. If it is not firing in your case then this suggests
> your problem lies elsewhere, or that meta_slots_used is not equal to the
> number of ring slots consumed.
> 
> It's seem to be the last ..
> 
> [ 1157.188908] vif vif-7-0 vif7.0: ?!? xenvif_gop_skb Me here 5 npo-
> >meta_prod:40 old_meta_prod:36 vif->rx.sring->req_prod:2105867 vif-
> >rx.req_cons:2105868 meta->gso_type:1 meta->gso_size:1448 nr_frags:1
> req->gref:657 req->id:7 estimated_slots_needed:4 j(data):1
> reserved_slots_left:-1    used in funcstart: 0 + 1 .. used_dataloop:1 ..
> used_fragloop:3
> [ 1157.244975] vif vif-7-0 vif7.0: ?!? xenvif_rx_action me here 2 ..  vif-
> >rx.sring->req_prod:2105867 vif->rx.req_cons:2105868 sco-
> >meta_slots_used:4 max_upped_gso:1 skb_is_gso(skb):1
> max_slots_needed:4 j:6 is_gso:1 nr_frags:1 firstpart:1 secondpart:2
> reserved_slots_left:-1
> 
> net_rx_action() calculated we would need 4 slots .. and sco-
> >meta_slots_used == 4 when we return so it doesn't trigger you BUG_ON ..
> 
> The 4 slots we calculated are:
>   1 slot for the data part: DIV_ROUND_UP(offset_in_page(skb->data) +
> skb_headlen(skb), PAGE_SIZE)
>   2 slots for the single frag in this SKB from: DIV_ROUND_UP(size, PAGE_SIZE)
>   1 slot since GSO
> 
> In the debug code i annotated all cons++, and the code uses 1 slot to process
> the data from the SKB as expected but uses 3 slots in the frag chopping loop.
> And when it reaches the state  were cons > prod it is always in
> "get_next_rx_buffer".
> 
> >> But probably because "npo->copy_prod++" seems to be used for the
> frags ..
> >> and it isn't added to  npo->meta_prod ?
> >>
> 
> > meta_slots_used is calculated as the value of meta_prod at return (from
> xenvif_gop_skb()) minus the value on entry ,
> > and if you look back up the code then you can see that meta_prod is
> incremented every time RING_GET_REQUEST() is evaluated.
> > So, we must be consuming a slot without evaluating RING_GET_REQUEST()
> and I think that's exactly what's happening...
> > Right at the bottom of xenvif_gop_frag_copy() req_cons is simply
> incremented in the case of a GSO. So the BUG_ON() is indeed off by one.
> 
> That is probably only done on first iteration / frag ?

Yes, the extra slot is accounted for right after the head frag is processed.

  Paul

> Because i don't see my warn there trigger .. but it could be that's because at
> that moment we still have cons <= prod.
> 
> 
> >   Paul
> 
> 

^ permalink raw reply	[flat|nested] 20+ messages in thread

* RE: [Xen-devel] Xen-unstable Linux 3.14-rc3 and 3.13 Network troubles "bisected"
  2014-03-26 18:15                                             ` Paul Durrant
@ 2014-03-26 18:42                                               ` Paul Durrant
  2014-03-26 20:17                                               ` Sander Eikelenboom
  1 sibling, 0 replies; 20+ messages in thread
From: Paul Durrant @ 2014-03-26 18:42 UTC (permalink / raw)
  To: Paul Durrant, Sander Eikelenboom
  Cc: Wei Liu, annie li, Zoltan Kiss, xen-devel@lists.xen.org,
	Ian Campbell, linux-kernel, netdev@vger.kernel.org

> -----Original Message-----
> From: netdev-owner@vger.kernel.org [mailto:netdev-
> owner@vger.kernel.org] On Behalf Of Paul Durrant
> Sent: 26 March 2014 18:16
> To: Sander Eikelenboom
> Cc: Wei Liu; annie li; Zoltan Kiss; xen-devel@lists.xen.org; Ian Campbell; linux-
> kernel; netdev@vger.kernel.org
> Subject: RE: [Xen-devel] Xen-unstable Linux 3.14-rc3 and 3.13 Network
> troubles "bisected"
> 
> > -----Original Message-----
> > From: Sander Eikelenboom [mailto:linux@eikelenboom.it]
> > Sent: 26 March 2014 18:08
> > To: Paul Durrant
> > Cc: Wei Liu; annie li; Zoltan Kiss; xen-devel@lists.xen.org; Ian Campbell;
> linux-
> > kernel; netdev@vger.kernel.org
> > Subject: Re: [Xen-devel] Xen-unstable Linux 3.14-rc3 and 3.13 Network
> > troubles "bisected"
> >
> >
> > Wednesday, March 26, 2014, 6:46:06 PM, you wrote:
> >
> > > Re-send shortened version...
> >
> > >> -----Original Message-----
> > >> From: Sander Eikelenboom [mailto:linux@eikelenboom.it]
> > >> Sent: 26 March 2014 16:54
> > >> To: Paul Durrant
> > >> Cc: Wei Liu; annie li; Zoltan Kiss; xen-devel@lists.xen.org; Ian Campbell;
> > linux-
> > >> kernel; netdev@vger.kernel.org
> > >> Subject: Re: [Xen-devel] Xen-unstable Linux 3.14-rc3 and 3.13 Network
> > >> troubles "bisected"
> > >>
> > > [snip]
> > >> >>
> > >> >> - When processing an SKB we end up in "xenvif_gop_frag_copy"
> while
> > >> prod
> > >> >> == cons ... but we still have bytes and size left ..
> > >> >> - start_new_rx_buffer() has returned true ..
> > >> >> - so we end up in get_next_rx_buffer
> > >> >> - this does a RING_GET_REQUEST and ups cons ..
> > >> >> - and we end up with a bad grant reference.
> > >> >>
> > >> >> Sometimes we are saved by the bell .. since additional slots have
> > become
> > >> >> free (you see cons become > prod in "get_next_rx_buffer" but
> shortly
> > >> after
> > >> >> that prod is increased ..
> > >> >> just in time to not cause a overrun).
> > >> >>
> > >>
> > >> > Ah, but hang on... There's a BUG_ON meta_slots_used >
> > >> max_slots_needed, so if we are overflowing the worst-case calculation
> > then
> > >> why is that BUG_ON not firing?
> > >>
> > >> You mean:
> > >>                 sco = (struct skb_cb_overlay *)skb->cb;
> > >>                 sco->meta_slots_used = xenvif_gop_skb(skb, &npo);
> > >>                 BUG_ON(sco->meta_slots_used > max_slots_needed);
> > >>
> > >> in "get_next_rx_buffer" ?
> > >>
> >
> > > That code excerpt is from net_rx_action(),isn't it?
> >
> > Yes
> >
> > >> I don't know .. at least now it doesn't crash dom0 and therefore not my
> > >> complete machine and since tcp is recovering from a failed packet  :-)
> > >>
> >
> > > Well, if the code calculating max_slots_needed were underestimating
> then
> > the BUG_ON() should fire. If it is not firing in your case then this suggests
> > your problem lies elsewhere, or that meta_slots_used is not equal to the
> > number of ring slots consumed.
> >
> > It's seem to be the last ..
> >
> > [ 1157.188908] vif vif-7-0 vif7.0: ?!? xenvif_gop_skb Me here 5 npo-
> > >meta_prod:40 old_meta_prod:36 vif->rx.sring->req_prod:2105867 vif-
> > >rx.req_cons:2105868 meta->gso_type:1 meta->gso_size:1448 nr_frags:1
> > req->gref:657 req->id:7 estimated_slots_needed:4 j(data):1
> > reserved_slots_left:-1    used in funcstart: 0 + 1 .. used_dataloop:1 ..
> > used_fragloop:3
> > [ 1157.244975] vif vif-7-0 vif7.0: ?!? xenvif_rx_action me here 2 ..  vif-
> > >rx.sring->req_prod:2105867 vif->rx.req_cons:2105868 sco-
> > >meta_slots_used:4 max_upped_gso:1 skb_is_gso(skb):1
> > max_slots_needed:4 j:6 is_gso:1 nr_frags:1 firstpart:1 secondpart:2
> > reserved_slots_left:-1
> >
> > net_rx_action() calculated we would need 4 slots .. and sco-
> > >meta_slots_used == 4 when we return so it doesn't trigger you BUG_ON
> ..
> >
> > The 4 slots we calculated are:
> >   1 slot for the data part: DIV_ROUND_UP(offset_in_page(skb->data) +
> > skb_headlen(skb), PAGE_SIZE)
> >   2 slots for the single frag in this SKB from: DIV_ROUND_UP(size,
> PAGE_SIZE)
> >   1 slot since GSO
> >
> > In the debug code i annotated all cons++, and the code uses 1 slot to
> process
> > the data from the SKB as expected but uses 3 slots in the frag chopping
> loop.

So, we must have done something like filling an existing slot, filling the next, but then having some data left over that requires a third. Could you try the following additional patch?

diff --git a/drivers/net/xen-netback/netback.c b/drivers/net/xen-netback/netback
index 58e5eb4..dfd8cea 100644
--- a/drivers/net/xen-netback/netback.c
+++ b/drivers/net/xen-netback/netback.c
@@ -499,7 +499,7 @@ static void xenvif_rx_action(struct xenvif *vif)
                for (i = 0; i < skb_shinfo(skb)->nr_frags; i++) {
                        unsigned int size;
                        size = skb_frag_size(&skb_shinfo(skb)->frags[i]);
-                       max_slots_needed += DIV_ROUND_UP(size, PAGE_SIZE);
+                       max_slots_needed += (size / PAGE_SIZE) + 2;
                }
                if (skb_is_gso(skb) &&
                   (skb_shinfo(skb)->gso_type & SKB_GSO_TCPV4 ||

  Paul

> > And when it reaches the state  were cons > prod it is always in
> > "get_next_rx_buffer".
> >
> > >> But probably because "npo->copy_prod++" seems to be used for the
> > frags ..
> > >> and it isn't added to  npo->meta_prod ?
> > >>
> >
> > > meta_slots_used is calculated as the value of meta_prod at return (from
> > xenvif_gop_skb()) minus the value on entry ,
> > > and if you look back up the code then you can see that meta_prod is
> > incremented every time RING_GET_REQUEST() is evaluated.
> > > So, we must be consuming a slot without evaluating
> RING_GET_REQUEST()
> > and I think that's exactly what's happening...
> > > Right at the bottom of xenvif_gop_frag_copy() req_cons is simply
> > incremented in the case of a GSO. So the BUG_ON() is indeed off by one.
> >
> > That is probably only done on first iteration / frag ?
> 
> Yes, the extra slot is accounted for right after the head frag is processed.
> 
>   Paul
> 
> > Because i don't see my warn there trigger .. but it could be that's because
> at
> > that moment we still have cons <= prod.
> >
> >
> > >   Paul
> >
> >
> 
> --
> To unsubscribe from this list: send the line "unsubscribe netdev" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* Re: [Xen-devel] Xen-unstable Linux 3.14-rc3 and 3.13 Network troubles "bisected"
  2014-03-26 17:48                                         ` Paul Durrant
@ 2014-03-26 19:57                                           ` Sander Eikelenboom
  2014-03-27  9:47                                             ` Paul Durrant
  0 siblings, 1 reply; 20+ messages in thread
From: Sander Eikelenboom @ 2014-03-26 19:57 UTC (permalink / raw)
  To: Paul Durrant
  Cc: Wei Liu, annie li, Zoltan Kiss, xen-devel@lists.xen.org,
	Ian Campbell, linux-kernel, netdev@vger.kernel.org


Wednesday, March 26, 2014, 6:48:15 PM, you wrote:

>> -----Original Message-----
>> From: Paul Durrant
>> Sent: 26 March 2014 17:47
>> To: 'Sander Eikelenboom'
>> Cc: Wei Liu; annie li; Zoltan Kiss; xen-devel@lists.xen.org; Ian Campbell; linux-
>> kernel; netdev@vger.kernel.org
>> Subject: RE: [Xen-devel] Xen-unstable Linux 3.14-rc3 and 3.13 Network
>> troubles "bisected"
>> 
>> Re-send shortened version...
>> 
>> > -----Original Message-----
>> > From: Sander Eikelenboom [mailto:linux@eikelenboom.it]
>> > Sent: 26 March 2014 16:54
>> > To: Paul Durrant
>> > Cc: Wei Liu; annie li; Zoltan Kiss; xen-devel@lists.xen.org; Ian Campbell;
>> linux-
>> > kernel; netdev@vger.kernel.org
>> > Subject: Re: [Xen-devel] Xen-unstable Linux 3.14-rc3 and 3.13 Network
>> > troubles "bisected"
>> >
>> [snip]
>> > >>
>> > >> - When processing an SKB we end up in "xenvif_gop_frag_copy" while
>> > prod
>> > >> == cons ... but we still have bytes and size left ..
>> > >> - start_new_rx_buffer() has returned true ..
>> > >> - so we end up in get_next_rx_buffer
>> > >> - this does a RING_GET_REQUEST and ups cons ..
>> > >> - and we end up with a bad grant reference.
>> > >>
>> > >> Sometimes we are saved by the bell .. since additional slots have
>> become
>> > >> free (you see cons become > prod in "get_next_rx_buffer" but shortly
>> > after
>> > >> that prod is increased ..
>> > >> just in time to not cause a overrun).
>> > >>
>> >
>> > > Ah, but hang on... There's a BUG_ON meta_slots_used >
>> > max_slots_needed, so if we are overflowing the worst-case calculation
>> then
>> > why is that BUG_ON not firing?
>> >
>> > You mean:
>> >                 sco = (struct skb_cb_overlay *)skb->cb;
>> >                 sco->meta_slots_used = xenvif_gop_skb(skb, &npo);
>> >                 BUG_ON(sco->meta_slots_used > max_slots_needed);
>> >
>> > in "get_next_rx_buffer" ?
>> >
>> 
>> That code excerpt is from net_rx_action(),isn't it?
>> 
>> > I don't know .. at least now it doesn't crash dom0 and therefore not my
>> > complete machine and since tcp is recovering from a failed packet  :-)
>> >
>> 
>> Well, if the code calculating max_slots_needed were underestimating then
>> the BUG_ON() should fire. If it is not firing in your case then this suggests
>> your problem lies elsewhere, or that meta_slots_used is not equal to the
>> number of ring slots consumed.
>> 
>> > But probably because "npo->copy_prod++" seems to be used for the frags
>> ..
>> > and it isn't added to  npo->meta_prod ?
>> >
>> 
>> meta_slots_used is calculated as the value of meta_prod at return (from
>> xenvif_gop_skb()) minus the value on entry , and if you look back up the
>> code then you can see that meta_prod is incremented every time
>> RING_GET_REQUEST() is evaluated. So, we must be consuming a slot without
>> evaluating RING_GET_REQUEST() and I think that's exactly what's
>> happening... Right at the bottom of xenvif_gop_frag_copy() req_cons is
>> simply incremented in the case of a GSO. So the BUG_ON() is indeed off by
>> one.
>> 

> Can you re-test with the following patch applied?

>   Paul

> diff --git a/drivers/net/xen-netback/netback.c b/drivers/net/xen-netback/netback
> index 438d0c0..4f24220 100644
> --- a/drivers/net/xen-netback/netback.c
> +++ b/drivers/net/xen-netback/netback.c
> @@ -482,6 +482,8 @@ static void xenvif_rx_action(struct xenvif *vif)

>         while ((skb = skb_dequeue(&vif->rx_queue)) != NULL) {
>                 RING_IDX max_slots_needed;
> +               RING_IDX old_req_cons;
> +               RING_IDX ring_slots_used;
>                 int i;

>                 /* We need a cheap worse case estimate for the number of
> @@ -511,8 +513,12 @@ static void xenvif_rx_action(struct xenvif *vif)
>                         vif->rx_last_skb_slots = 0;

>                 sco = (struct skb_cb_overlay *)skb->cb;
> +
> +               old_req_cons = vif->rx.req_cons;
>                 sco->meta_slots_used = xenvif_gop_skb(skb, &npo);
> -               BUG_ON(sco->meta_slots_used > max_slots_needed);
> +               ring_slots_used = vif->rx.req_cons - old_req_cons;
> +
> +               BUG_ON(ring_slots_used > max_slots_needed);

>                 __skb_queue_tail(&rxq, skb);
>         }

That blew pretty fast .. on that BUG_ON

[  290.218182] ------------[ cut here ]------------
[  290.225425] kernel BUG at drivers/net/xen-netback/netback.c:664!
[  290.232717] invalid opcode: 0000 [#1] SMP
[  290.239875] Modules linked in:
[  290.246923] CPU: 0 PID: 10447 Comm: vif7.0 Not tainted 3.13.6-20140326-nbdebug35+ #1
[  290.254040] Hardware name: MSI MS-7640/890FXA-GD70 (MS-7640)  , BIOS V1.8B1 09/13/2010
[  290.261313] task: ffff880055d16480 ti: ffff88004cb7e000 task.ti: ffff88004cb7e000
[  290.268713] RIP: e030:[<ffffffff81780430>]  [<ffffffff81780430>] xenvif_rx_action+0x1650/0x1670
[  290.276193] RSP: e02b:ffff88004cb7fc28  EFLAGS: 00010202
[  290.283555] RAX: 0000000000000006 RBX: ffff88004c630000 RCX: 3fffffffffffffff
[  290.290908] RDX: 00000000ffffffff RSI: ffff88004c630940 RDI: 0000000000048e7b
[  290.298325] RBP: ffff88004cb7fde8 R08: 0000000000007bc9 R09: 0000000000000005
[  290.305809] R10: ffff88004cb7fd28 R11: ffffc90012690600 R12: 0000000000000004
[  290.313217] R13: ffff8800536a84e0 R14: 0000000000000001 R15: ffff88004c637618
[  290.320521] FS:  00007f1d3030c700(0000) GS:ffff88005f600000(0000) knlGS:0000000000000000
[  290.327839] CS:  e033 DS: 0000 ES: 0000 CR0: 000000008005003b
[  290.335216] CR2: ffffffffff600400 CR3: 0000000058537000 CR4: 0000000000000660
[  290.342732] Stack:
[  290.350129]  ffff88004cb7fd2c ffff880000000005 ffff88004cb7fd28 ffffffff810f7fc8
[  290.357652]  ffff880055d16b50 ffffffff00000407 ffff880000000000 ffffffff00000000
[  290.365048]  ffff880055d16b50 ffff880000000001 ffff880000000001 ffffffff00000000
[  290.372461] Call Trace:
[  290.379806]  [<ffffffff810f7fc8>] ? __lock_acquire+0x418/0x2220
[  290.387211]  [<ffffffff810df5f6>] ? finish_task_switch+0x46/0xf0
[  290.394552]  [<ffffffff81781400>] xenvif_kthread+0x40/0x190
[  290.401808]  [<ffffffff810f05e0>] ? __init_waitqueue_head+0x60/0x60
[  290.408993]  [<ffffffff817813c0>] ? xenvif_stop_queue+0x60/0x60
[  290.416238]  [<ffffffff810d4f24>] kthread+0xe4/0x100
[  290.423428]  [<ffffffff81b4cf30>] ? _raw_spin_unlock_irq+0x30/0x50
[  290.430615]  [<ffffffff810d4e40>] ? __init_kthread_worker+0x70/0x70
[  290.437793]  [<ffffffff81b4e13c>] ret_from_fork+0x7c/0xb0
[  290.444945]  [<ffffffff810d4e40>] ? __init_kthread_worker+0x70/0x70
[  290.452091] Code: fd ff ff 48 8b b5 f0 fe ff ff 48 c7 c2 10 98 ce 81 31 c0 48 8b be c8 7c 00 00 48 c7 c6 f0 f1 fd 81 e8 35 be 24 00 e9 ba f8 ff ff <0f> 0b 0f 0b 41 bf 01 00 00 00 e9 55 f6 ff ff 0f 0b 66 66 66 66
[  290.467121] RIP  [<ffffffff81780430>] xenvif_rx_action+0x1650/0x1670
[  290.474436]  RSP <ffff88004cb7fc28>
[  290.482400] ---[ end trace 2fcf9e9ae26950b3 ]---

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [Xen-devel] Xen-unstable Linux 3.14-rc3 and 3.13 Network troubles "bisected"
  2014-03-26 18:15                                             ` Paul Durrant
  2014-03-26 18:42                                               ` Paul Durrant
@ 2014-03-26 20:17                                               ` Sander Eikelenboom
  2014-03-27  9:54                                                 ` Paul Durrant
  1 sibling, 1 reply; 20+ messages in thread
From: Sander Eikelenboom @ 2014-03-26 20:17 UTC (permalink / raw)
  To: Paul Durrant
  Cc: Wei Liu, annie li, Zoltan Kiss, xen-devel@lists.xen.org,
	Ian Campbell, linux-kernel, netdev@vger.kernel.org


Wednesday, March 26, 2014, 7:15:30 PM, you wrote:

>> -----Original Message-----
>> From: Sander Eikelenboom [mailto:linux@eikelenboom.it]
>> Sent: 26 March 2014 18:08
>> To: Paul Durrant
>> Cc: Wei Liu; annie li; Zoltan Kiss; xen-devel@lists.xen.org; Ian Campbell; linux-
>> kernel; netdev@vger.kernel.org
>> Subject: Re: [Xen-devel] Xen-unstable Linux 3.14-rc3 and 3.13 Network
>> troubles "bisected"
>> 
>> 
>> Wednesday, March 26, 2014, 6:46:06 PM, you wrote:
>> 
>> > Re-send shortened version...
>> 
>> >> -----Original Message-----
>> >> From: Sander Eikelenboom [mailto:linux@eikelenboom.it]
>> >> Sent: 26 March 2014 16:54
>> >> To: Paul Durrant
>> >> Cc: Wei Liu; annie li; Zoltan Kiss; xen-devel@lists.xen.org; Ian Campbell;
>> linux-
>> >> kernel; netdev@vger.kernel.org
>> >> Subject: Re: [Xen-devel] Xen-unstable Linux 3.14-rc3 and 3.13 Network
>> >> troubles "bisected"
>> >>
>> > [snip]
>> >> >>
>> >> >> - When processing an SKB we end up in "xenvif_gop_frag_copy" while
>> >> prod
>> >> >> == cons ... but we still have bytes and size left ..
>> >> >> - start_new_rx_buffer() has returned true ..
>> >> >> - so we end up in get_next_rx_buffer
>> >> >> - this does a RING_GET_REQUEST and ups cons ..
>> >> >> - and we end up with a bad grant reference.
>> >> >>
>> >> >> Sometimes we are saved by the bell .. since additional slots have
>> become
>> >> >> free (you see cons become > prod in "get_next_rx_buffer" but shortly
>> >> after
>> >> >> that prod is increased ..
>> >> >> just in time to not cause a overrun).
>> >> >>
>> >>
>> >> > Ah, but hang on... There's a BUG_ON meta_slots_used >
>> >> max_slots_needed, so if we are overflowing the worst-case calculation
>> then
>> >> why is that BUG_ON not firing?
>> >>
>> >> You mean:
>> >>                 sco = (struct skb_cb_overlay *)skb->cb;
>> >>                 sco->meta_slots_used = xenvif_gop_skb(skb, &npo);
>> >>                 BUG_ON(sco->meta_slots_used > max_slots_needed);
>> >>
>> >> in "get_next_rx_buffer" ?
>> >>
>> 
>> > That code excerpt is from net_rx_action(),isn't it?
>> 
>> Yes
>> 
>> >> I don't know .. at least now it doesn't crash dom0 and therefore not my
>> >> complete machine and since tcp is recovering from a failed packet  :-)
>> >>
>> 
>> > Well, if the code calculating max_slots_needed were underestimating then
>> the BUG_ON() should fire. If it is not firing in your case then this suggests
>> your problem lies elsewhere, or that meta_slots_used is not equal to the
>> number of ring slots consumed.
>> 
>> It's seem to be the last ..
>> 
>> [ 1157.188908] vif vif-7-0 vif7.0: ?!? xenvif_gop_skb Me here 5 npo-
>> >meta_prod:40 old_meta_prod:36 vif->rx.sring->req_prod:2105867 vif-
>> >rx.req_cons:2105868 meta->gso_type:1 meta->gso_size:1448 nr_frags:1
>> req->gref:657 req->id:7 estimated_slots_needed:4 j(data):1
>> reserved_slots_left:-1    used in funcstart: 0 + 1 .. used_dataloop:1 ..
>> used_fragloop:3
>> [ 1157.244975] vif vif-7-0 vif7.0: ?!? xenvif_rx_action me here 2 ..  vif-
>> >rx.sring->req_prod:2105867 vif->rx.req_cons:2105868 sco-
>> >meta_slots_used:4 max_upped_gso:1 skb_is_gso(skb):1
>> max_slots_needed:4 j:6 is_gso:1 nr_frags:1 firstpart:1 secondpart:2
>> reserved_slots_left:-1
>> 
>> net_rx_action() calculated we would need 4 slots .. and sco-
>> >meta_slots_used == 4 when we return so it doesn't trigger you BUG_ON ..
>> 
>> The 4 slots we calculated are:
>>   1 slot for the data part: DIV_ROUND_UP(offset_in_page(skb->data) +
>> skb_headlen(skb), PAGE_SIZE)
>>   2 slots for the single frag in this SKB from: DIV_ROUND_UP(size, PAGE_SIZE)
>>   1 slot since GSO
>> 
>> In the debug code i annotated all cons++, and the code uses 1 slot to process
>> the data from the SKB as expected but uses 3 slots in the frag chopping loop.
>> And when it reaches the state  were cons > prod it is always in
>> "get_next_rx_buffer".
>> 
>> >> But probably because "npo->copy_prod++" seems to be used for the
>> frags ..
>> >> and it isn't added to  npo->meta_prod ?
>> >>
>> 
>> > meta_slots_used is calculated as the value of meta_prod at return (from
>> xenvif_gop_skb()) minus the value on entry ,
>> > and if you look back up the code then you can see that meta_prod is
>> incremented every time RING_GET_REQUEST() is evaluated.
>> > So, we must be consuming a slot without evaluating RING_GET_REQUEST()
>> and I think that's exactly what's happening...
>> > Right at the bottom of xenvif_gop_frag_copy() req_cons is simply
>> incremented in the case of a GSO. So the BUG_ON() is indeed off by one.
>> 
>> That is probably only done on first iteration / frag ?

> Yes, the extra slot is accounted for right after the head frag is processed

Ok so we are talking about:

if (*head && ((1 << gso_type) & vif->gso_mask)){
                        vif->rx.req_cons++;

Well, I had some debug code in place to detect whether that path is taken as well:

[ 1157.095216] vif vif-7-0 vif7.0: ?!? xenvif_gop_frag_copy Me here end npo->meta_prod:40 vif->rx.sring->req_prod:2105867 vif->rx.req_cons:2105868 npo->copy_gref:4325379 npo->copy_off:560  MAX_BUFFER_OFFSET:4096 bytes:560 size:0  offset:560 head:1273462060 i:3 vif->rx.sring->req_event:2105868 gso_gaps:0 estimated_slots_needed:4 reserved_slots_left:-1

Well "gso_gaps:0" indicates that in this case that path in "xenvif_gop_frag_copy()" has not been taken in any iteration of that frag.

However i=3 .. so we have done 3 iterations of the loop while we expected to do only 2 ...

So that would mean that somehow the code in "xenvif_gop_frag_copy()" needs more slots
to break this frag down than the loop with DIV_ROUND_UP(size, PAGE_SIZE) in net_rx_action() has accounted for.
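
As a worked example of how that can happen (purely illustrative: the sizes are made up and the buffer-switching rule below is a simplified model of the logic discussed in this thread, not the driver source), take a frag of 8192 bytes that starts 2048 bytes into its first source page. It gets copied in chunks of 2048, 4096 and 2048 bytes, and if the current ring buffer already holds head data, each chunk ends up opening a fresh buffer, so the frag consumes 3 slots while DIV_ROUND_UP(8192, PAGE_SIZE) accounts for only 2:

#include <stdio.h>

#define PAGE_SIZE         4096u
#define MAX_BUFFER_OFFSET PAGE_SIZE
#define DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d))

int main(void)
{
        unsigned int size = 8192, src_off = 2048;  /* frag spans 3 source pages */
        unsigned int copy_off = 3072;              /* head data already in the buffer */
        unsigned int slots = 0;

        while (size) {
                unsigned int bytes = PAGE_SIZE - src_off;  /* chunk limited by source page */

                if (bytes > size)
                        bytes = size;

                /* simplified model: open a new buffer when the chunk won't fit */
                if (copy_off == MAX_BUFFER_OFFSET ||
                    (copy_off && copy_off + bytes > MAX_BUFFER_OFFSET &&
                     bytes <= MAX_BUFFER_OFFSET)) {
                        slots++;        /* the get_next_rx_buffer() case: consumes a slot */
                        copy_off = 0;
                }

                copy_off += bytes;
                size -= bytes;
                src_off = (src_off + bytes) % PAGE_SIZE;
        }

        printf("slots used: %u, estimated: %u\n",
               slots, DIV_ROUND_UP(8192u, PAGE_SIZE));      /* prints 3 vs 2 */
        return 0;
}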

^ permalink raw reply	[flat|nested] 20+ messages in thread

* RE: [Xen-devel] Xen-unstable Linux 3.14-rc3 and 3.13 Network troubles "bisected"
  2014-03-26 19:57                                           ` Sander Eikelenboom
@ 2014-03-27  9:47                                             ` Paul Durrant
  2014-03-27 10:00                                               ` Sander Eikelenboom
  0 siblings, 1 reply; 20+ messages in thread
From: Paul Durrant @ 2014-03-27  9:47 UTC (permalink / raw)
  To: Sander Eikelenboom
  Cc: Wei Liu, annie li, Zoltan Kiss, xen-devel@lists.xen.org,
	Ian Campbell, linux-kernel, netdev@vger.kernel.org

> -----Original Message-----
> From: Sander Eikelenboom [mailto:linux@eikelenboom.it]
> Sent: 26 March 2014 19:57
> To: Paul Durrant
> Cc: Wei Liu; annie li; Zoltan Kiss; xen-devel@lists.xen.org; Ian Campbell; linux-
> kernel; netdev@vger.kernel.org
> Subject: Re: [Xen-devel] Xen-unstable Linux 3.14-rc3 and 3.13 Network
> troubles "bisected"
> 
> 
> Wednesday, March 26, 2014, 6:48:15 PM, you wrote:
> 
> >> -----Original Message-----
> >> From: Paul Durrant
> >> Sent: 26 March 2014 17:47
> >> To: 'Sander Eikelenboom'
> >> Cc: Wei Liu; annie li; Zoltan Kiss; xen-devel@lists.xen.org; Ian Campbell;
> linux-
> >> kernel; netdev@vger.kernel.org
> >> Subject: RE: [Xen-devel] Xen-unstable Linux 3.14-rc3 and 3.13 Network
> >> troubles "bisected"
> >>
> >> Re-send shortened version...
> >>
> >> > -----Original Message-----
> >> > From: Sander Eikelenboom [mailto:linux@eikelenboom.it]
> >> > Sent: 26 March 2014 16:54
> >> > To: Paul Durrant
> >> > Cc: Wei Liu; annie li; Zoltan Kiss; xen-devel@lists.xen.org; Ian Campbell;
> >> linux-
> >> > kernel; netdev@vger.kernel.org
> >> > Subject: Re: [Xen-devel] Xen-unstable Linux 3.14-rc3 and 3.13 Network
> >> > troubles "bisected"
> >> >
> >> [snip]
> >> > >>
> >> > >> - When processing an SKB we end up in "xenvif_gop_frag_copy"
> while
> >> > prod
> >> > >> == cons ... but we still have bytes and size left ..
> >> > >> - start_new_rx_buffer() has returned true ..
> >> > >> - so we end up in get_next_rx_buffer
> >> > >> - this does a RING_GET_REQUEST and ups cons ..
> >> > >> - and we end up with a bad grant reference.
> >> > >>
> >> > >> Sometimes we are saved by the bell .. since additional slots have
> >> become
> >> > >> free (you see cons become > prod in "get_next_rx_buffer" but
> shortly
> >> > after
> >> > >> that prod is increased ..
> >> > >> just in time to not cause a overrun).
> >> > >>
> >> >
> >> > > Ah, but hang on... There's a BUG_ON meta_slots_used >
> >> > max_slots_needed, so if we are overflowing the worst-case calculation
> >> then
> >> > why is that BUG_ON not firing?
> >> >
> >> > You mean:
> >> >                 sco = (struct skb_cb_overlay *)skb->cb;
> >> >                 sco->meta_slots_used = xenvif_gop_skb(skb, &npo);
> >> >                 BUG_ON(sco->meta_slots_used > max_slots_needed);
> >> >
> >> > in "get_next_rx_buffer" ?
> >> >
> >>
> >> That code excerpt is from net_rx_action(),isn't it?
> >>
> >> > I don't know .. at least now it doesn't crash dom0 and therefore not my
> >> > complete machine and since tcp is recovering from a failed packet  :-)
> >> >
> >>
> >> Well, if the code calculating max_slots_needed were underestimating
> then
> >> the BUG_ON() should fire. If it is not firing in your case then this suggests
> >> your problem lies elsewhere, or that meta_slots_used is not equal to the
> >> number of ring slots consumed.
> >>
> >> > But probably because "npo->copy_prod++" seems to be used for the
> frags
> >> ..
> >> > and it isn't added to  npo->meta_prod ?
> >> >
> >>
> >> meta_slots_used is calculated as the value of meta_prod at return (from
> >> xenvif_gop_skb()) minus the value on entry , and if you look back up the
> >> code then you can see that meta_prod is incremented every time
> >> RING_GET_REQUEST() is evaluated. So, we must be consuming a slot
> without
> >> evaluating RING_GET_REQUEST() and I think that's exactly what's
> >> happening... Right at the bottom of xenvif_gop_frag_copy() req_cons is
> >> simply incremented in the case of a GSO. So the BUG_ON() is indeed off
> by
> >> one.
> >>
> 
> > Can you re-test with the following patch applied?
> 
> >   Paul
> 
> > diff --git a/drivers/net/xen-netback/netback.c b/drivers/net/xen-
> netback/netback
> > index 438d0c0..4f24220 100644
> > --- a/drivers/net/xen-netback/netback.c
> > +++ b/drivers/net/xen-netback/netback.c
> > @@ -482,6 +482,8 @@ static void xenvif_rx_action(struct xenvif *vif)
> 
> >         while ((skb = skb_dequeue(&vif->rx_queue)) != NULL) {
> >                 RING_IDX max_slots_needed;
> > +               RING_IDX old_req_cons;
> > +               RING_IDX ring_slots_used;
> >                 int i;
> 
> >                 /* We need a cheap worse case estimate for the number of
> > @@ -511,8 +513,12 @@ static void xenvif_rx_action(struct xenvif *vif)
> >                         vif->rx_last_skb_slots = 0;
> 
> >                 sco = (struct skb_cb_overlay *)skb->cb;
> > +
> > +               old_req_cons = vif->rx.req_cons;
> >                 sco->meta_slots_used = xenvif_gop_skb(skb, &npo);
> > -               BUG_ON(sco->meta_slots_used > max_slots_needed);
> > +               ring_slots_used = vif->rx.req_cons - old_req_cons;
> > +
> > +               BUG_ON(ring_slots_used > max_slots_needed);
> 
> >                 __skb_queue_tail(&rxq, skb);
> >         }
> 
> That blew pretty fast .. on that BUG_ON
> 

Good. That's what should have happened :-)

  Paul

> [  290.218182] ------------[ cut here ]------------
> [  290.225425] kernel BUG at drivers/net/xen-netback/netback.c:664!
> [  290.232717] invalid opcode: 0000 [#1] SMP
> [  290.239875] Modules linked in:
> [  290.246923] CPU: 0 PID: 10447 Comm: vif7.0 Not tainted 3.13.6-20140326-
> nbdebug35+ #1
> [  290.254040] Hardware name: MSI MS-7640/890FXA-GD70 (MS-7640)  , BIOS
> V1.8B1 09/13/2010
> [  290.261313] task: ffff880055d16480 ti: ffff88004cb7e000 task.ti:
> ffff88004cb7e000
> [  290.268713] RIP: e030:[<ffffffff81780430>]  [<ffffffff81780430>]
> xenvif_rx_action+0x1650/0x1670
> [  290.276193] RSP: e02b:ffff88004cb7fc28  EFLAGS: 00010202
> [  290.283555] RAX: 0000000000000006 RBX: ffff88004c630000 RCX:
> 3fffffffffffffff
> [  290.290908] RDX: 00000000ffffffff RSI: ffff88004c630940 RDI:
> 0000000000048e7b
> [  290.298325] RBP: ffff88004cb7fde8 R08: 0000000000007bc9 R09:
> 0000000000000005
> [  290.305809] R10: ffff88004cb7fd28 R11: ffffc90012690600 R12:
> 0000000000000004
> [  290.313217] R13: ffff8800536a84e0 R14: 0000000000000001 R15:
> ffff88004c637618
> [  290.320521] FS:  00007f1d3030c700(0000) GS:ffff88005f600000(0000)
> knlGS:0000000000000000
> [  290.327839] CS:  e033 DS: 0000 ES: 0000 CR0: 000000008005003b
> [  290.335216] CR2: ffffffffff600400 CR3: 0000000058537000 CR4:
> 0000000000000660
> [  290.342732] Stack:
> [  290.350129]  ffff88004cb7fd2c ffff880000000005 ffff88004cb7fd28
> ffffffff810f7fc8
> [  290.357652]  ffff880055d16b50 ffffffff00000407 ffff880000000000
> ffffffff00000000
> [  290.365048]  ffff880055d16b50 ffff880000000001 ffff880000000001
> ffffffff00000000
> [  290.372461] Call Trace:
> [  290.379806]  [<ffffffff810f7fc8>] ? __lock_acquire+0x418/0x2220
> [  290.387211]  [<ffffffff810df5f6>] ? finish_task_switch+0x46/0xf0
> [  290.394552]  [<ffffffff81781400>] xenvif_kthread+0x40/0x190
> [  290.401808]  [<ffffffff810f05e0>] ? __init_waitqueue_head+0x60/0x60
> [  290.408993]  [<ffffffff817813c0>] ? xenvif_stop_queue+0x60/0x60
> [  290.416238]  [<ffffffff810d4f24>] kthread+0xe4/0x100
> [  290.423428]  [<ffffffff81b4cf30>] ? _raw_spin_unlock_irq+0x30/0x50
> [  290.430615]  [<ffffffff810d4e40>] ? __init_kthread_worker+0x70/0x70
> [  290.437793]  [<ffffffff81b4e13c>] ret_from_fork+0x7c/0xb0
> [  290.444945]  [<ffffffff810d4e40>] ? __init_kthread_worker+0x70/0x70
> [  290.452091] Code: fd ff ff 48 8b b5 f0 fe ff ff 48 c7 c2 10 98 ce 81 31 c0 48 8b
> be c8 7c 00 00 48 c7 c6 f0 f1 fd 81 e8 35 be 24 00 e9 ba f8 ff ff <0f> 0b 0f 0b 41
> bf 01 00 00 00 e9 55 f6 ff ff 0f 0b 66 66 66 66
> [  290.467121] RIP  [<ffffffff81780430>] xenvif_rx_action+0x1650/0x1670
> [  290.474436]  RSP <ffff88004cb7fc28>
> [  290.482400] ---[ end trace 2fcf9e9ae26950b3 ]---

^ permalink raw reply	[flat|nested] 20+ messages in thread

* RE: [Xen-devel] Xen-unstable Linux 3.14-rc3 and 3.13 Network troubles "bisected"
  2014-03-26 20:17                                               ` Sander Eikelenboom
@ 2014-03-27  9:54                                                 ` Paul Durrant
  2014-03-27 10:05                                                   ` Sander Eikelenboom
  0 siblings, 1 reply; 20+ messages in thread
From: Paul Durrant @ 2014-03-27  9:54 UTC (permalink / raw)
  To: Sander Eikelenboom
  Cc: Wei Liu, annie li, Zoltan Kiss, xen-devel@lists.xen.org,
	Ian Campbell, linux-kernel, netdev@vger.kernel.org

> -----Original Message-----
> From: Sander Eikelenboom [mailto:linux@eikelenboom.it]
> Sent: 26 March 2014 20:18
> To: Paul Durrant
> Cc: Wei Liu; annie li; Zoltan Kiss; xen-devel@lists.xen.org; Ian Campbell; linux-
> kernel; netdev@vger.kernel.org
> Subject: Re: [Xen-devel] Xen-unstable Linux 3.14-rc3 and 3.13 Network
> troubles "bisected"
> 
> 
> Wednesday, March 26, 2014, 7:15:30 PM, you wrote:
> 
> >> -----Original Message-----
> >> From: Sander Eikelenboom [mailto:linux@eikelenboom.it]
> >> Sent: 26 March 2014 18:08
> >> To: Paul Durrant
> >> Cc: Wei Liu; annie li; Zoltan Kiss; xen-devel@lists.xen.org; Ian Campbell;
> linux-
> >> kernel; netdev@vger.kernel.org
> >> Subject: Re: [Xen-devel] Xen-unstable Linux 3.14-rc3 and 3.13 Network
> >> troubles "bisected"
> >>
> >>
> >> Wednesday, March 26, 2014, 6:46:06 PM, you wrote:
> >>
> >> > Re-send shortened version...
> >>
> >> >> -----Original Message-----
> >> >> From: Sander Eikelenboom [mailto:linux@eikelenboom.it]
> >> >> Sent: 26 March 2014 16:54
> >> >> To: Paul Durrant
> >> >> Cc: Wei Liu; annie li; Zoltan Kiss; xen-devel@lists.xen.org; Ian Campbell;
> >> linux-
> >> >> kernel; netdev@vger.kernel.org
> >> >> Subject: Re: [Xen-devel] Xen-unstable Linux 3.14-rc3 and 3.13 Network
> >> >> troubles "bisected"
> >> >>
> >> > [snip]
> >> >> >>
> >> >> >> - When processing an SKB we end up in "xenvif_gop_frag_copy"
> while
> >> >> prod
> >> >> >> == cons ... but we still have bytes and size left ..
> >> >> >> - start_new_rx_buffer() has returned true ..
> >> >> >> - so we end up in get_next_rx_buffer
> >> >> >> - this does a RING_GET_REQUEST and ups cons ..
> >> >> >> - and we end up with a bad grant reference.
> >> >> >>
> >> >> >> Sometimes we are saved by the bell .. since additional slots have
> >> become
> >> >> >> free (you see cons become > prod in "get_next_rx_buffer" but
> shortly
> >> >> after
> >> >> >> that prod is increased ..
> >> >> >> just in time to not cause a overrun).
> >> >> >>
> >> >>
> >> >> > Ah, but hang on... There's a BUG_ON meta_slots_used >
> >> >> max_slots_needed, so if we are overflowing the worst-case calculation
> >> then
> >> >> why is that BUG_ON not firing?
> >> >>
> >> >> You mean:
> >> >>                 sco = (struct skb_cb_overlay *)skb->cb;
> >> >>                 sco->meta_slots_used = xenvif_gop_skb(skb, &npo);
> >> >>                 BUG_ON(sco->meta_slots_used > max_slots_needed);
> >> >>
> >> >> in "get_next_rx_buffer" ?
> >> >>
> >>
> >> > That code excerpt is from net_rx_action(),isn't it?
> >>
> >> Yes
> >>
> >> >> I don't know .. at least now it doesn't crash dom0 and therefore not my
> >> >> complete machine and since tcp is recovering from a failed packet  :-)
> >> >>
> >>
> >> > Well, if the code calculating max_slots_needed were underestimating
> then
> >> the BUG_ON() should fire. If it is not firing in your case then this suggests
> >> your problem lies elsewhere, or that meta_slots_used is not equal to the
> >> number of ring slots consumed.
> >>
> >> It's seem to be the last ..
> >>
> >> [ 1157.188908] vif vif-7-0 vif7.0: ?!? xenvif_gop_skb Me here 5 npo-
> >> >meta_prod:40 old_meta_prod:36 vif->rx.sring->req_prod:2105867 vif-
> >> >rx.req_cons:2105868 meta->gso_type:1 meta->gso_size:1448 nr_frags:1
> >> req->gref:657 req->id:7 estimated_slots_needed:4 j(data):1
> >> reserved_slots_left:-1    used in funcstart: 0 + 1 .. used_dataloop:1 ..
> >> used_fragloop:3
> >> [ 1157.244975] vif vif-7-0 vif7.0: ?!? xenvif_rx_action me here 2 ..  vif-
> >> >rx.sring->req_prod:2105867 vif->rx.req_cons:2105868 sco-
> >> >meta_slots_used:4 max_upped_gso:1 skb_is_gso(skb):1
> >> max_slots_needed:4 j:6 is_gso:1 nr_frags:1 firstpart:1 secondpart:2
> >> reserved_slots_left:-1
> >>
> >> net_rx_action() calculated we would need 4 slots .. and sco-
> >> >meta_slots_used == 4 when we return so it doesn't trigger you BUG_ON
> ..
> >>
> >> The 4 slots we calculated are:
> >>   1 slot for the data part: DIV_ROUND_UP(offset_in_page(skb->data) +
> >> skb_headlen(skb), PAGE_SIZE)
> >>   2 slots for the single frag in this SKB from: DIV_ROUND_UP(size,
> PAGE_SIZE)
> >>   1 slot since GSO
> >>
> >> In the debug code i annotated all cons++, and the code uses 1 slot to
> process
> >> the data from the SKB as expected but uses 3 slots in the frag chopping
> loop.
> >> And when it reaches the state  were cons > prod it is always in
> >> "get_next_rx_buffer".
> >>
> >> >> But probably because "npo->copy_prod++" seems to be used for the
> >> frags ..
> >> >> and it isn't added to  npo->meta_prod ?
> >> >>
> >>
> >> > meta_slots_used is calculated as the value of meta_prod at return
> (from
> >> xenvif_gop_skb()) minus the value on entry ,
> >> > and if you look back up the code then you can see that meta_prod is
> >> incremented every time RING_GET_REQUEST() is evaluated.
> >> > So, we must be consuming a slot without evaluating
> RING_GET_REQUEST()
> >> and I think that's exactly what's happening...
> >> > Right at the bottom of xenvif_gop_frag_copy() req_cons is simply
> >> incremented in the case of a GSO. So the BUG_ON() is indeed off by one.
> >>
> >> That is probably only done on first iteration / frag ?
> 
> > Yes, the extra slot is accounted for right after the head frag is processed
> 
> Ok so we are talking about:
> 
> if (*head && ((1 << gso_type) & vif->gso_mask)){
>                         vif->rx.req_cons++;
> 
> Well it had some debug code in place to detect if that path is taken as well:
> 
> [ 1157.095216] vif vif-7-0 vif7.0: ?!? xenvif_gop_frag_copy Me here end npo-
> >meta_prod:40 vif->rx.sring->req_prod:2105867 vif->rx.req_cons:2105868
> npo->copy_gref:4325379 npo->copy_off:560  MAX_BUFFER_OFFSET:4096
> bytes:560 size:0  offset:560 head:1273462060 i:3 vif->rx.sring-
> >req_event:2105868 gso_gaps:0 estimated_slots_needed:4
> reserved_slots_left:-1
> 
> Well "gso_gaps:0" indicates that in this case that path in
> "xenvif_gop_frag_copy()" has not been taken in any iteration of that frag.
> 
> However i=3 .. so we have done 3 iterations of the loop while we expected
> to do only 2 ...
> 
> So that would mean that somehow the code in "xenvif_gop_frag_copy()"
> needs more slots
> to brake this frag down than the loop with DIV_ROUND_UP(size, PAGE_SIZE)
> in net_rx_action() has accounted for.
> 

Yes. The code is not assuming worst-case page-spanning and it looks like it needs to.

I also notice a bogus clause in this if statement in start_new_rx_buffer():

        if ((offset + size > MAX_BUFFER_OFFSET) &&
            (size <= MAX_BUFFER_OFFSET) && offset && !head)
                return true;

MAX_BUFFER_OFFSET is defined to be PAGE_SIZE and xenvif_gop_frag_copy() never passes a value of size > PAGE_SIZE, so that 2nd clause is completely pointless.
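
For illustration (a simplification of the point above, not a proposed patch), once size <= MAX_BUFFER_OFFSET always holds that test collapses to:

#include <stdio.h>
#include <stdbool.h>

#define MAX_BUFFER_OFFSET 4096UL   /* == PAGE_SIZE, per the discussion above */

/* the (size <= MAX_BUFFER_OFFSET) clause is always true, so it drops out */
static bool needs_new_buffer(unsigned long offset, unsigned long size, int head)
{
        return (offset + size > MAX_BUFFER_OFFSET) && offset && !head;
}

int main(void)
{
        /* same decisions as the quoted clause for any size <= 4096 */
        printf("%d %d\n", needs_new_buffer(3072, 2048, 0),   /* 1: would overflow */
                          needs_new_buffer(2048, 2048, 0));  /* 0: fits exactly */
        return 0;
}
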
I'll come up with some patches shortly.

  Paul

> 
> 
> 
> 
> 

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [Xen-devel] Xen-unstable Linux 3.14-rc3 and 3.13 Network troubles "bisected"
  2014-03-27  9:47                                             ` Paul Durrant
@ 2014-03-27 10:00                                               ` Sander Eikelenboom
  0 siblings, 0 replies; 20+ messages in thread
From: Sander Eikelenboom @ 2014-03-27 10:00 UTC (permalink / raw)
  To: Paul Durrant
  Cc: Wei Liu, annie li, Zoltan Kiss, xen-devel@lists.xen.org,
	Ian Campbell, linux-kernel, netdev@vger.kernel.org


Thursday, March 27, 2014, 10:47:02 AM, you wrote:

>> -----Original Message-----
>> From: Sander Eikelenboom [mailto:linux@eikelenboom.it]
>> Sent: 26 March 2014 19:57
>> To: Paul Durrant
>> Cc: Wei Liu; annie li; Zoltan Kiss; xen-devel@lists.xen.org; Ian Campbell; linux-
>> kernel; netdev@vger.kernel.org
>> Subject: Re: [Xen-devel] Xen-unstable Linux 3.14-rc3 and 3.13 Network
>> troubles "bisected"
>> 
>> 
>> Wednesday, March 26, 2014, 6:48:15 PM, you wrote:
>> 
>> >> -----Original Message-----
>> >> From: Paul Durrant
>> >> Sent: 26 March 2014 17:47
>> >> To: 'Sander Eikelenboom'
>> >> Cc: Wei Liu; annie li; Zoltan Kiss; xen-devel@lists.xen.org; Ian Campbell;
>> linux-
>> >> kernel; netdev@vger.kernel.org
>> >> Subject: RE: [Xen-devel] Xen-unstable Linux 3.14-rc3 and 3.13 Network
>> >> troubles "bisected"
>> >>
>> >> Re-send shortened version...
>> >>
>> >> > -----Original Message-----
>> >> > From: Sander Eikelenboom [mailto:linux@eikelenboom.it]
>> >> > Sent: 26 March 2014 16:54
>> >> > To: Paul Durrant
>> >> > Cc: Wei Liu; annie li; Zoltan Kiss; xen-devel@lists.xen.org; Ian Campbell;
>> >> linux-
>> >> > kernel; netdev@vger.kernel.org
>> >> > Subject: Re: [Xen-devel] Xen-unstable Linux 3.14-rc3 and 3.13 Network
>> >> > troubles "bisected"
>> >> >
>> >> [snip]
>> >> > >>
>> >> > >> - When processing an SKB we end up in "xenvif_gop_frag_copy"
>> while
>> >> > prod
>> >> > >> == cons ... but we still have bytes and size left ..
>> >> > >> - start_new_rx_buffer() has returned true ..
>> >> > >> - so we end up in get_next_rx_buffer
>> >> > >> - this does a RING_GET_REQUEST and ups cons ..
>> >> > >> - and we end up with a bad grant reference.
>> >> > >>
>> >> > >> Sometimes we are saved by the bell .. since additional slots have
>> >> become
>> >> > >> free (you see cons become > prod in "get_next_rx_buffer" but
>> shortly
>> >> > after
>> >> > >> that prod is increased ..
>> >> > >> just in time to not cause a overrun).
>> >> > >>
>> >> >
>> >> > > Ah, but hang on... There's a BUG_ON meta_slots_used >
>> >> > max_slots_needed, so if we are overflowing the worst-case calculation
>> >> then
>> >> > why is that BUG_ON not firing?
>> >> >
>> >> > You mean:
>> >> >                 sco = (struct skb_cb_overlay *)skb->cb;
>> >> >                 sco->meta_slots_used = xenvif_gop_skb(skb, &npo);
>> >> >                 BUG_ON(sco->meta_slots_used > max_slots_needed);
>> >> >
>> >> > in "get_next_rx_buffer" ?
>> >> >
>> >>
>> >> That code excerpt is from net_rx_action(),isn't it?
>> >>
>> >> > I don't know .. at least now it doesn't crash dom0 and therefore not my
>> >> > complete machine and since tcp is recovering from a failed packet  :-)
>> >> >
>> >>
>> >> Well, if the code calculating max_slots_needed were underestimating
>> then
>> >> the BUG_ON() should fire. If it is not firing in your case then this suggests
>> >> your problem lies elsewhere, or that meta_slots_used is not equal to the
>> >> number of ring slots consumed.
>> >>
>> >> > But probably because "npo->copy_prod++" seems to be used for the
>> frags
>> >> ..
>> >> > and it isn't added to  npo->meta_prod ?
>> >> >
>> >>
>> >> meta_slots_used is calculated as the value of meta_prod at return (from
>> >> xenvif_gop_skb()) minus the value on entry , and if you look back up the
>> >> code then you can see that meta_prod is incremented every time
>> >> RING_GET_REQUEST() is evaluated. So, we must be consuming a slot
>> without
>> >> evaluating RING_GET_REQUEST() and I think that's exactly what's
>> >> happening... Right at the bottom of xenvif_gop_frag_copy() req_cons is
>> >> simply incremented in the case of a GSO. So the BUG_ON() is indeed off
>> by
>> >> one.
>> >>
>> 
>> > Can you re-test with the following patch applied?
>> 
>> >   Paul
>> 
>> > diff --git a/drivers/net/xen-netback/netback.c b/drivers/net/xen-
>> netback/netback
>> > index 438d0c0..4f24220 100644
>> > --- a/drivers/net/xen-netback/netback.c
>> > +++ b/drivers/net/xen-netback/netback.c
>> > @@ -482,6 +482,8 @@ static void xenvif_rx_action(struct xenvif *vif)
>> 
>> >         while ((skb = skb_dequeue(&vif->rx_queue)) != NULL) {
>> >                 RING_IDX max_slots_needed;
>> > +               RING_IDX old_req_cons;
>> > +               RING_IDX ring_slots_used;
>> >                 int i;
>> 
>> >                 /* We need a cheap worse case estimate for the number of
>> > @@ -511,8 +513,12 @@ static void xenvif_rx_action(struct xenvif *vif)
>> >                         vif->rx_last_skb_slots = 0;
>> 
>> >                 sco = (struct skb_cb_overlay *)skb->cb;
>> > +
>> > +               old_req_cons = vif->rx.req_cons;
>> >                 sco->meta_slots_used = xenvif_gop_skb(skb, &npo);
>> > -               BUG_ON(sco->meta_slots_used > max_slots_needed);
>> > +               ring_slots_used = vif->rx.req_cons - old_req_cons;
>> > +
>> > +               BUG_ON(ring_slots_used > max_slots_needed);
>> 
>> >                 __skb_queue_tail(&rxq, skb);
>> >         }
>> 
>> That blew pretty fast .. on that BUG_ON
>> 

> Good. That's what should have happened :-)

Yes .. and No ..
We shouldn't be there in the first place :-)

Since every miscalculation of the needed slots now leads to a nice remote DoS attack ..
(since we now crash the vif kthread)
it would be nice to have a worst-case slot calculation .. with some theoretical guarantees

--
Sander

>   Paul

>> [  290.218182] ------------[ cut here ]------------
>> [  290.225425] kernel BUG at drivers/net/xen-netback/netback.c:664!
>> [  290.232717] invalid opcode: 0000 [#1] SMP
>> [  290.239875] Modules linked in:
>> [  290.246923] CPU: 0 PID: 10447 Comm: vif7.0 Not tainted 3.13.6-20140326-nbdebug35+ #1
>> [  290.254040] Hardware name: MSI MS-7640/890FXA-GD70 (MS-7640)  , BIOS V1.8B1 09/13/2010
>> [  290.261313] task: ffff880055d16480 ti: ffff88004cb7e000 task.ti: ffff88004cb7e000
>> [  290.268713] RIP: e030:[<ffffffff81780430>]  [<ffffffff81780430>] xenvif_rx_action+0x1650/0x1670
>> [  290.276193] RSP: e02b:ffff88004cb7fc28  EFLAGS: 00010202
>> [  290.283555] RAX: 0000000000000006 RBX: ffff88004c630000 RCX: 3fffffffffffffff
>> [  290.290908] RDX: 00000000ffffffff RSI: ffff88004c630940 RDI: 0000000000048e7b
>> [  290.298325] RBP: ffff88004cb7fde8 R08: 0000000000007bc9 R09: 0000000000000005
>> [  290.305809] R10: ffff88004cb7fd28 R11: ffffc90012690600 R12: 0000000000000004
>> [  290.313217] R13: ffff8800536a84e0 R14: 0000000000000001 R15: ffff88004c637618
>> [  290.320521] FS:  00007f1d3030c700(0000) GS:ffff88005f600000(0000) knlGS:0000000000000000
>> [  290.327839] CS:  e033 DS: 0000 ES: 0000 CR0: 000000008005003b
>> [  290.335216] CR2: ffffffffff600400 CR3: 0000000058537000 CR4: 0000000000000660
>> [  290.342732] Stack:
>> [  290.350129]  ffff88004cb7fd2c ffff880000000005 ffff88004cb7fd28 ffffffff810f7fc8
>> [  290.357652]  ffff880055d16b50 ffffffff00000407 ffff880000000000 ffffffff00000000
>> [  290.365048]  ffff880055d16b50 ffff880000000001 ffff880000000001 ffffffff00000000
>> [  290.372461] Call Trace:
>> [  290.379806]  [<ffffffff810f7fc8>] ? __lock_acquire+0x418/0x2220
>> [  290.387211]  [<ffffffff810df5f6>] ? finish_task_switch+0x46/0xf0
>> [  290.394552]  [<ffffffff81781400>] xenvif_kthread+0x40/0x190
>> [  290.401808]  [<ffffffff810f05e0>] ? __init_waitqueue_head+0x60/0x60
>> [  290.408993]  [<ffffffff817813c0>] ? xenvif_stop_queue+0x60/0x60
>> [  290.416238]  [<ffffffff810d4f24>] kthread+0xe4/0x100
>> [  290.423428]  [<ffffffff81b4cf30>] ? _raw_spin_unlock_irq+0x30/0x50
>> [  290.430615]  [<ffffffff810d4e40>] ? __init_kthread_worker+0x70/0x70
>> [  290.437793]  [<ffffffff81b4e13c>] ret_from_fork+0x7c/0xb0
>> [  290.444945]  [<ffffffff810d4e40>] ? __init_kthread_worker+0x70/0x70
>> [  290.452091] Code: fd ff ff 48 8b b5 f0 fe ff ff 48 c7 c2 10 98 ce 81 31 c0 48 8b be c8 7c 00 00 48 c7 c6 f0 f1 fd 81 e8 35 be 24 00 e9 ba f8 ff ff <0f> 0b 0f 0b 41 bf 01 00 00 00 e9 55 f6 ff ff 0f 0b 66 66 66 66
>> [  290.467121] RIP  [<ffffffff81780430>] xenvif_rx_action+0x1650/0x1670
>> [  290.474436]  RSP <ffff88004cb7fc28>
>> [  290.482400] ---[ end trace 2fcf9e9ae26950b3 ]---


* Re: [Xen-devel] Xen-unstable Linux 3.14-rc3 and 3.13 Network troubles "bisected"
  2014-03-27  9:54                                                 ` Paul Durrant
@ 2014-03-27 10:05                                                   ` Sander Eikelenboom
  0 siblings, 0 replies; 20+ messages in thread
From: Sander Eikelenboom @ 2014-03-27 10:05 UTC (permalink / raw)
  To: Paul Durrant
  Cc: Wei Liu, annie li, Zoltan Kiss, xen-devel@lists.xen.org,
	Ian Campbell, linux-kernel, netdev@vger.kernel.org

Hrmm, I don't know if it's your mailer or mine .. but I seem to get a lot of your mails truncated somehow :S
The xen-devel list archive seems to have them in complete form though .. so it's probably my mailer tripping over something.

> I'll come up with some patches shortly.

OK will test them ASAP.


Thursday, March 27, 2014, 10:54:09 AM, you wrote:

>> -----Original Message-----
>> From: Sander Eikelenboom [mailto:linux@eikelenboom.it]
>> Sent: 26 March 2014 20:18
>> To: Paul Durrant
>> Cc: Wei Liu; annie li; Zoltan Kiss; xen-devel@lists.xen.org; Ian Campbell; linux-kernel; netdev@vger.kernel.org
>> Subject: Re: [Xen-devel] Xen-unstable Linux 3.14-rc3 and 3.13 Network
>> troubles "bisected"
>> 
>> 
>> Wednesday, March 26, 2014, 7:15:30 PM, you wrote:
>> 
>> >> -----Original Message-----
>> >> From: Sander Eikelenboom [mailto:linux@eikelenboom.it]
>> >> Sent: 26 March 2014 18:08
>> >> To: Paul Durrant
>> >> Cc: Wei Liu; annie li; Zoltan Kiss; xen-devel@lists.xen.org; Ian Campbell; linux-kernel; netdev@vger.kernel.org
>> >> Subject: Re: [Xen-devel] Xen-unstable Linux 3.14-rc3 and 3.13 Network
>> >> troubles "bisected"
>> >>
>> >>
>> >> Wednesday, March 26, 2014, 6:46:06 PM, you wrote:
>> >>
>> >> > Re-send shortened version...
>> >>
>> >> >> -----Original Message-----
>> >> >> From: Sander Eikelenboom [mailto:linux@eikelenboom.it]
>> >> >> Sent: 26 March 2014 16:54
>> >> >> To: Paul Durrant
>> >> >> Cc: Wei Liu; annie li; Zoltan Kiss; xen-devel@lists.xen.org; Ian Campbell; linux-kernel; netdev@vger.kernel.org
>> >> >> Subject: Re: [Xen-devel] Xen-unstable Linux 3.14-rc3 and 3.13 Network
>> >> >> troubles "bisected"
>> >> >>
>> >> > [snip]
>> >> >> >>
>> >> >> >> - When processing an SKB we end up in "xenvif_gop_frag_copy" while prod == cons ... but we still have bytes and size left ..
>> >> >> >> - start_new_rx_buffer() has returned true ..
>> >> >> >> - so we end up in get_next_rx_buffer
>> >> >> >> - this does a RING_GET_REQUEST and ups cons ..
>> >> >> >> - and we end up with a bad grant reference.
>> >> >> >>
>> >> >> >> Sometimes we are saved by the bell .. since additional slots have become
>> >> >> >> free (you see cons become > prod in "get_next_rx_buffer" but shortly after
>> >> >> >> that prod is increased .. just in time to not cause an overrun).
>> >> >> >>
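
For reference, the loop being described there has roughly this shape (paraphrased
from the 3.13-era xenvif_gop_frag_copy() and trimmed down, so don't read it as the
exact source):

    while (size > 0) {
            BUG_ON(offset >= PAGE_SIZE);
            BUG_ON(npo->copy_off > MAX_BUFFER_OFFSET);

            bytes = PAGE_SIZE - offset;
            if (bytes > size)
                    bytes = size;

            if (start_new_rx_buffer(npo->copy_off, bytes, *head)) {
                    /* The step in question: fetch a fresh ring slot and
                     * advance req_cons, whether or not the frontend has
                     * actually posted another request yet. */
                    meta = get_next_rx_buffer(vif, npo);
            }

            /* ... queue the grant copy of 'bytes' into the current buffer,
             * advance offset, reduce size, and loop ... */
    }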
>> >> >>
>> >> >> > Ah, but hang on... There's a BUG_ON meta_slots_used > max_slots_needed,
>> >> >> > so if we are overflowing the worst-case calculation then why is that
>> >> >> > BUG_ON not firing?
>> >> >>
>> >> >> You mean:
>> >> >>                 sco = (struct skb_cb_overlay *)skb->cb;
>> >> >>                 sco->meta_slots_used = xenvif_gop_skb(skb, &npo);
>> >> >>                 BUG_ON(sco->meta_slots_used > max_slots_needed);
>> >> >>
>> >> >> in "get_next_rx_buffer" ?
>> >> >>
>> >>
>> >> > That code excerpt is from net_rx_action(), isn't it?
>> >>
>> >> Yes
>> >>
>> >> >> I don't know .. at least now it doesn't crash dom0 (and therefore not my
>> >> >> complete machine), and TCP recovers from the failed packet :-)
>> >> >>
>> >>
>> >> > Well, if the code calculating max_slots_needed were underestimating then
>> >> > the BUG_ON() should fire. If it is not firing in your case then this
>> >> > suggests your problem lies elsewhere, or that meta_slots_used is not equal
>> >> > to the number of ring slots consumed.
>> >>
>> >> It seems to be the latter ..
>> >>
>> >> [ 1157.188908] vif vif-7-0 vif7.0: ?!? xenvif_gop_skb Me here 5 npo->meta_prod:40 old_meta_prod:36 vif->rx.sring->req_prod:2105867 vif->rx.req_cons:2105868 meta->gso_type:1 meta->gso_size:1448 nr_frags:1 req->gref:657 req->id:7 estimated_slots_needed:4 j(data):1 reserved_slots_left:-1    used in funcstart: 0 + 1 .. used_dataloop:1 .. used_fragloop:3
>> >> [ 1157.244975] vif vif-7-0 vif7.0: ?!? xenvif_rx_action me here 2 ..  vif->rx.sring->req_prod:2105867 vif->rx.req_cons:2105868 sco->meta_slots_used:4 max_upped_gso:1 skb_is_gso(skb):1 max_slots_needed:4 j:6 is_gso:1 nr_frags:1 firstpart:1 secondpart:2 reserved_slots_left:-1
>> >>
>> >> net_rx_action() calculated we would need 4 slots .. and sco->meta_slots_used == 4 when we return so it doesn't trigger your BUG_ON ..
>> >>
>> >> The 4 slots we calculated are:
>> >>   1 slot for the data part: DIV_ROUND_UP(offset_in_page(skb->data) + skb_headlen(skb), PAGE_SIZE)
>> >>   2 slots for the single frag in this SKB from: DIV_ROUND_UP(size, PAGE_SIZE)
>> >>   1 slot since GSO
>> >>
>> >> In the debug code I annotated all cons++, and the code uses 1 slot to process
>> >> the data from the SKB as expected but uses 3 slots in the frag chopping loop.
>> >> And when it reaches the state where cons > prod it is always in
>> >> "get_next_rx_buffer".
>> >>
>> >> >> But probably because "npo->copy_prod++" seems to be used for the frags ..
>> >> >> and it isn't added to npo->meta_prod ?
>> >> >>
>> >>
>> >> > meta_slots_used is calculated as the value of meta_prod at return (from
>> >> > xenvif_gop_skb()) minus the value on entry, and if you look back up the
>> >> > code then you can see that meta_prod is incremented every time
>> >> > RING_GET_REQUEST() is evaluated. So, we must be consuming a slot without
>> >> > evaluating RING_GET_REQUEST() and I think that's exactly what's
>> >> > happening... Right at the bottom of xenvif_gop_frag_copy() req_cons is
>> >> > simply incremented in the case of a GSO. So the BUG_ON() is indeed off
>> >> > by one
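
In other words, the two counters have to be kept apart; a sketch of the
accounting (illustrative only, not the wording of an actual patch):

    old_req_cons  = vif->rx.req_cons;
    old_meta_prod = npo.meta_prod;

    sco->meta_slots_used = xenvif_gop_skb(skb, &npo);

    meta_slots_used = npo.meta_prod - old_meta_prod;   /* what the old BUG_ON checked */
    ring_slots_used = vif->rx.req_cons - old_req_cons; /* what the ring actually lost */

    /* For a GSO skb, ring_slots_used can be meta_slots_used + 1 because of the
     * bare req_cons++ at the bottom of xenvif_gop_frag_copy(), so this is the
     * quantity that must be bounded by max_slots_needed. */
    BUG_ON(ring_slots_used > max_slots_needed);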


Thread overview: 20+ messages
     [not found] <1744594108.20140318162127@eikelenboom.it>
     [not found] ` <20140318160412.GB16807@zion.uk.xensource.com>
     [not found]   ` <1701035622.20140318211402@eikelenboom.it>
     [not found]     ` <722971844.20140318221859@eikelenboom.it>
     [not found]       ` <1688396550.20140319001104@eikelenboom.it>
     [not found]         ` <20140319113532.GD16807@zion.uk.xensource.com>
     [not found]           ` <246793256.20140319220752@eikelenboom.it>
     [not found]             ` <20140321164958.GA31766@zion.uk.xensource.com>
     [not found]               ` <1334202265.20140321182727@eikelenboom.it>
     [not found]                 ` <1056661597.20140322192834@eikelenboom.it>
     [not found]                   ` <20140325151539.GG31766@zion.uk.xensource.com>
     [not found]                     ` <79975567.20140325162942@eikelenboom.it>
2014-03-26 11:11                       ` [Xen-devel] Xen-unstable Linux 3.14-rc3 and 3.13 Network troubles "bisected" Sander Eikelenboom
2014-03-26 14:44                         ` Paul Durrant
2014-03-26 15:22                           ` Sander Eikelenboom
2014-03-26 15:50                             ` Paul Durrant
2014-03-26 16:06                               ` Sander Eikelenboom
2014-03-26 16:25                                 ` Paul Durrant
2014-03-26 16:53                                   ` Sander Eikelenboom
2014-03-26 17:16                                     ` Paul Durrant
2014-03-26 17:33                                       ` Sander Eikelenboom
2014-03-26 17:46                                         ` Paul Durrant
2014-03-26 18:07                                           ` Sander Eikelenboom
2014-03-26 18:15                                             ` Paul Durrant
2014-03-26 18:42                                               ` Paul Durrant
2014-03-26 20:17                                               ` Sander Eikelenboom
2014-03-27  9:54                                                 ` Paul Durrant
2014-03-27 10:05                                                   ` Sander Eikelenboom
2014-03-26 17:48                                         ` Paul Durrant
2014-03-26 19:57                                           ` Sander Eikelenboom
2014-03-27  9:47                                             ` Paul Durrant
2014-03-27 10:00                                               ` Sander Eikelenboom
