* pegged softirq and NAPI race (?)
@ 2018-09-18 8:41 Song Liu
2018-09-18 13:45 ` Eric Dumazet
0 siblings, 1 reply; 22+ messages in thread
From: Song Liu @ 2018-09-18 8:41 UTC (permalink / raw)
To: Networking
Cc: edumazet@google.com, jeffrey.t.kirsher@intel.com,
alexander.h.duyck@intel.com, Michael Chan, Kernel Team
We are debugging an issue where a netconsole message triggers a pegged softirq
(ksoftirqd taking 100% of a CPU for many seconds). We found this issue in
production with both bnxt and ixgbe, on a 4.11-based kernel. It is easily
reproducible with ixgbe on both 4.11 and the latest net/net-next (see [1] for
more detail).
After some debugging, we found that the issue is likely related to
39e6c8208d7b ("net: solve a NAPI race"). After reverting this commit, the
steps described in [1] no longer reproduce the issue on ixgbe. Reverting it
also reduces how often we hit the issue with bnxt (it still happens, at a
lower rate).
I tried to fix the issue with a relaxed variant (or an older version) of
napi_schedule_prep() in netpoll, like the one used in napi_watchdog().
However, my tests do not always go as expected.
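For reference, this is the "relaxed" scheduling pattern napi_watchdog() uses,
as introduced by 39e6c8208d7b (abridged from net/core/dev.c of that era):

	/* Relaxed variant of napi_schedule_prep(): schedule NAPI only if
	 * it is not already scheduled and not being disabled, without
	 * recording a MISSED event, since this path does not react to a
	 * device IRQ.
	 */
	if (!napi_disable_pending(napi) &&
	    !test_and_set_bit(NAPI_STATE_SCHED, &napi->state))
		__napi_schedule_irqoff(napi);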
Please share your comments/suggestions on which direction we should take to
fix this.
Thanks in advance!
Song
[1] https://www.spinics.net/lists/netdev/msg522328.html
* Re: pegged softirq and NAPI race (?)
2018-09-18 8:41 pegged softirq and NAPI race (?) Song Liu
@ 2018-09-18 13:45 ` Eric Dumazet
2018-09-18 16:19 ` Song Liu
2018-09-18 17:51 ` Alexei Starovoitov
0 siblings, 2 replies; 22+ messages in thread
From: Eric Dumazet @ 2018-09-18 13:45 UTC (permalink / raw)
To: songliubraving
Cc: netdev, Jeff Kirsher, Alexander Duyck, michael.chan, kernel-team
On Tue, Sep 18, 2018 at 1:41 AM Song Liu <songliubraving@fb.com> wrote:
>
> [...]
You have not traced ixgbe to understand why the driver hits
"clean_complete=false" all the time?
* Re: pegged softirq and NAPI race (?)
2018-09-18 13:45 ` Eric Dumazet
@ 2018-09-18 16:19 ` Song Liu
2018-09-18 16:31 ` Rik van Riel
2018-09-18 16:33 ` Eric Dumazet
2018-09-18 17:51 ` Alexei Starovoitov
1 sibling, 2 replies; 22+ messages in thread
From: Song Liu @ 2018-09-18 16:19 UTC (permalink / raw)
To: Eric Dumazet
Cc: netdev, Jeff Kirsher, Alexander Duyck, michael.chan@broadcom.com,
Kernel Team
> On Sep 18, 2018, at 6:45 AM, Eric Dumazet <edumazet@google.com> wrote:
>
> On Tue, Sep 18, 2018 at 1:41 AM Song Liu <songliubraving@fb.com> wrote:
>>
>> [...]
>
> You have not traced ixgbe to understand why the driver hits
> "clean_complete=false" all the time?
The trace showed that we get "clean_complete=false" because
ixgbe_clean_rx_irq() uses the full budget (64). It feels like the driver
is tricked into processing old data on the rx_ring one more time.

Have you seen a similar issue?
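(For context, this is roughly how ixgbe_poll() ends up with
clean_complete=false, abridged from ixgbe_main.c of that era; exact details
may differ between trees:)

	/* attempt to distribute budget to each queue fairly, but don't
	 * allow the budget to go below 1 because we'll exit polling
	 */
	if (q_vector->rx.count > 1)
		per_ring_budget = max(budget / q_vector->rx.count, 1);
	else
		per_ring_budget = budget;

	ixgbe_for_each_ring(ring, q_vector->rx) {
		int cleaned = ixgbe_clean_rx_irq(q_vector, ring,
						 per_ring_budget);

		work_done += cleaned;
		/* a ring that consumed its whole share means more work */
		if (cleaned >= per_ring_budget)
			clean_complete = false;
	}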
Thanks,
Song
* Re: pegged softirq and NAPI race (?)
2018-09-18 16:19 ` Song Liu
@ 2018-09-18 16:31 ` Rik van Riel
2018-09-18 16:33 ` Eric Dumazet
1 sibling, 0 replies; 22+ messages in thread
From: Rik van Riel @ 2018-09-18 16:31 UTC (permalink / raw)
To: Song Liu, Eric Dumazet
Cc: netdev, Jeff Kirsher, Alexander Duyck, michael.chan@broadcom.com,
Kernel Team
On Tue, 2018-09-18 at 12:19 -0400, Song Liu wrote:
> > On Sep 18, 2018, at 6:45 AM, Eric Dumazet <edumazet@google.com> wrote:
> >
> > On Tue, Sep 18, 2018 at 1:41 AM Song Liu <songliubraving@fb.com> wrote:
> > >
> > > [...]
> >
> > You have not traced ixgbe to understand why the driver hits
> > "clean_complete=false" all the time?
>
> The trace showed that we get "clean_complete=false" because
> ixgbe_clean_rx_irq() uses the full budget (64). It feels like the driver
> is tricked into processing old data on the rx_ring one more time.
>
> Have you seen a similar issue?
A quick reading of the code suggests that this means polling cannot keep
up with the rate of incoming packets.

That should not be a surprise, given that polling appears to happen on
just one CPU, while interrupt-driven packet delivery was fanned out
across a larger number of CPUs.

Does the NAPI code have any way to periodically force a return to IRQ
mode, because multiple CPUs in IRQ mode can keep up with packets better
than a single CPU in polling mode?

Alternatively, is NAPI with multi-queue network adapters supposed to be
polling on multiple CPUs, but simply failing to do so in this case?
* Re: pegged softirq and NAPI race (?)
2018-09-18 16:19 ` Song Liu
2018-09-18 16:31 ` Rik van Riel
@ 2018-09-18 16:33 ` Eric Dumazet
2018-09-18 16:49 ` Song Liu
1 sibling, 1 reply; 22+ messages in thread
From: Eric Dumazet @ 2018-09-18 16:33 UTC (permalink / raw)
To: songliubraving
Cc: netdev, Jeff Kirsher, Alexander Duyck, michael.chan, kernel-team
On Tue, Sep 18, 2018 at 9:19 AM Song Liu <songliubraving@fb.com> wrote:
>
>
>
> > On Sep 18, 2018, at 6:45 AM, Eric Dumazet <edumazet@google.com> wrote:
> >
> > On Tue, Sep 18, 2018 at 1:41 AM Song Liu <songliubraving@fb.com> wrote:
> >>
> >> [...]
> >
> > You have not traced ixgbe to understand why the driver hits
> > "clean_complete=false" all the time?
>
> The trace showed that we get "clean_complete=false" because
> ixgbe_clean_rx_irq() uses the full budget (64). It feels like the driver
> is tricked into processing old data on the rx_ring one more time.
Process old data???? That would be quite a horrible bug!

Probably KASAN would help here, detecting use-after-free or things like that.
* Re: pegged softirq and NAPI race (?)
2018-09-18 16:33 ` Eric Dumazet
@ 2018-09-18 16:49 ` Song Liu
0 siblings, 0 replies; 22+ messages in thread
From: Song Liu @ 2018-09-18 16:49 UTC (permalink / raw)
To: Eric Dumazet
Cc: netdev, Jeff Kirsher, Alexander Duyck, michael.chan@broadcom.com,
Kernel Team
> On Sep 18, 2018, at 9:33 AM, Eric Dumazet <edumazet@google.com> wrote:
>
> On Tue, Sep 18, 2018 at 9:19 AM Song Liu <songliubraving@fb.com> wrote:
>>
>> [...]
>>
>> The trace showed that we get "clean_complete=false" because
>> ixgbe_clean_rx_irq() uses the full budget (64). It feels like the driver
>> is tricked into processing old data on the rx_ring one more time.
>
> Process old data???? That would be quite a horrible bug!
>
> Probably KASAN would help here, detecting use-after-free or things like that.
I have tried KASAN; unfortunately, it does not yield any useful data. I don't
think it is a use-after-free of some skb. It looks more like a bug in the
handling of the rx/tx rings.
Thanks,
Song
* Re: pegged softirq and NAPI race (?)
2018-09-18 13:45 ` Eric Dumazet
2018-09-18 16:19 ` Song Liu
@ 2018-09-18 17:51 ` Alexei Starovoitov
2018-09-18 18:17 ` Eric Dumazet
1 sibling, 1 reply; 22+ messages in thread
From: Alexei Starovoitov @ 2018-09-18 17:51 UTC (permalink / raw)
To: Eric Dumazet, Song Liu
Cc: netdev, Jeff Kirsher, Alexander Duyck, michael.chan@broadcom.com,
Kernel Team
On 9/18/18 6:45 AM, Eric Dumazet wrote:
> On Tue, Sep 18, 2018 at 1:41 AM Song Liu <songliubraving@fb.com> wrote:
>>
>> [...]
>
> You have not traced ixgbe to understand why the driver hits
> "clean_complete=false" all the time?
Eric,
I'm looking at commit 39e6c8208d7b and wondering why it does
clear_bit(NAPI_STATE_MISSED, ...);
for busy_poll_stop(), but not for netpoll.
Can that be an issue?
If so, then something like the below is needed:
diff --git a/net/core/netpoll.c b/net/core/netpoll.c
index 57557a6a950c..a848be6b503c 100644
--- a/net/core/netpoll.c
+++ b/net/core/netpoll.c
@@ -172,6 +172,7 @@ static void poll_one_napi(struct napi_struct *napi)
 	trace_napi_poll(napi, work, 0);
 
 	clear_bit(NAPI_STATE_NPSVC, &napi->state);
+	clear_bit(NAPI_STATE_MISSED, &napi->state);
 }
* Re: pegged softirq and NAPI race (?)
2018-09-18 17:51 ` Alexei Starovoitov
@ 2018-09-18 18:17 ` Eric Dumazet
2018-09-18 20:37 ` Song Liu
0 siblings, 1 reply; 22+ messages in thread
From: Eric Dumazet @ 2018-09-18 18:17 UTC (permalink / raw)
To: Alexei Starovoitov
Cc: songliubraving, netdev, Jeff Kirsher, Alexander Duyck,
michael.chan, kernel-team
On Tue, Sep 18, 2018 at 10:51 AM Alexei Starovoitov <ast@fb.com> wrote:
>
> On 9/18/18 6:45 AM, Eric Dumazet wrote:
> > On Tue, Sep 18, 2018 at 1:41 AM Song Liu <songliubraving@fb.com> wrote:
> >>
> >> [...]
> >
> > You have not traced ixgbe to understand why the driver hits
> > "clean_complete=false" all the time?
>
> Eric,
>
> I'm looking at commit 39e6c8208d7b and wondering why it does
> clear_bit(NAPI_STATE_MISSED, ...);
> for busy_poll_stop(), but not for netpoll.
> Can that be an issue?
>
> If so, then something like the below is needed:
>
> [...]
NAPI_STATE_MISSED should only be cleared under strict circumstances.

The clear in busy_poll_stop() is really an optimization (as explained
in the comment).

It is cleared when napi_complete_done() is eventually called, but if
ixgbe always handles 64 RX frames in its poll function,
napi_complete_done() will not be called. The bug is in ixgbe,
which pretends its poll function should be called forever.
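To illustrate the contract, here is a minimal sketch of a well-behaved poll
function (driver_clean_rx() and driver_enable_irq() are hypothetical
placeholders; the napi_* calls are the real API):

static int driver_poll(struct napi_struct *napi, int budget)
{
	int work_done = driver_clean_rx(napi, budget);

	/* Whole budget consumed: stay scheduled, keep device IRQs off,
	 * and let net_rx_action() poll us again.
	 */
	if (work_done == budget)
		return budget;

	/* Re-enable device IRQs only if NAPI really completed.  If
	 * NAPI_STATE_MISSED was set in the meantime, napi_complete_done()
	 * returns false and reschedules the poll instead.
	 */
	if (napi_complete_done(napi, work_done))
		driver_enable_irq(napi);

	return work_done;
}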
* Re: pegged softirq and NAPI race (?)
2018-09-18 18:17 ` Eric Dumazet
@ 2018-09-18 20:37 ` Song Liu
2018-09-18 21:13 ` Eric Dumazet
0 siblings, 1 reply; 22+ messages in thread
From: Song Liu @ 2018-09-18 20:37 UTC (permalink / raw)
To: Eric Dumazet
Cc: Alexei Starovoitov, netdev, Jeff Kirsher, Alexander Duyck,
michael.chan@broadcom.com, Kernel Team
> On Sep 18, 2018, at 11:17 AM, Eric Dumazet <edumazet@google.com> wrote:
>
> On Tue, Sep 18, 2018 at 10:51 AM Alexei Starovoitov <ast@fb.com> wrote:
>>
>> On 9/18/18 6:45 AM, Eric Dumazet wrote:
>>> On Tue, Sep 18, 2018 at 1:41 AM Song Liu <songliubraving@fb.com> wrote:
>>>>
>>>> [...]
>>>
>>> You have not traced ixgbe to understand why the driver hits
>>> "clean_complete=false" all the time?
>>
>> Eric,
>>
>> I'm looking at commit 39e6c8208d7b and wondering why it does
>> clear_bit(NAPI_STATE_MISSED, ...);
>> for busy_poll_stop(), but not for netpoll.
>> Can that be an issue?
>>
>> If so, then something like the below is needed:
>>
>> [...]
>
> NAPI_STATE_MISSED should only be cleared under strict circumstances.
>
> The clear in busy_poll_stop() is really an optimization (as explained
> in the comment).
>
> It is cleared when napi_complete_done() is eventually called, but if
> ixgbe always handles 64 RX frames in its poll function,
> napi_complete_done() will not be called. The bug is in ixgbe,
> which pretends its poll function should be called forever.
It looks like a patch like the following fixes the issue for ixgbe, but I
cannot explain it yet.

Does this ring a bell?
Thanks,
Song
diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
index 787c84fb20dd..51611f799dae 100644
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
@@ -3059,11 +3059,14 @@ static irqreturn_t ixgbe_msix_other(int irq, void *data)
 
 static irqreturn_t ixgbe_msix_clean_rings(int irq, void *data)
 {
 	struct ixgbe_q_vector *q_vector = data;
+	struct napi_struct *napi = &q_vector->napi;
 
 	/* EIAM disabled interrupts (on this vector) for us */
-	if (q_vector->rx.ring || q_vector->tx.ring)
-		napi_schedule_irqoff(&q_vector->napi);
+	if ((q_vector->rx.ring || q_vector->tx.ring) &&
+	    !napi_disable_pending(napi) &&
+	    !test_and_set_bit(NAPI_STATE_SCHED, &napi->state))
+		__napi_schedule_irqoff(napi);
 
 	return IRQ_HANDLED;
 }
* Re: pegged softirq and NAPI race (?)
2018-09-18 20:37 ` Song Liu
@ 2018-09-18 21:13 ` Eric Dumazet
2018-09-18 21:21 ` Eric Dumazet
` (2 more replies)
0 siblings, 3 replies; 22+ messages in thread
From: Eric Dumazet @ 2018-09-18 21:13 UTC (permalink / raw)
To: songliubraving
Cc: Alexei Starovoitov, netdev, Jeff Kirsher, Alexander Duyck,
michael.chan, kernel-team
On Tue, Sep 18, 2018 at 1:37 PM Song Liu <songliubraving@fb.com> wrote:
>
> It looks like a patch like the following fixes the issue for ixgbe, but I
> cannot explain it yet.
>
> Does this ring a bell?
I dunno, it looks like the NIC is generating an interrupt when it should not,
and constantly sets NAPI_STATE_MISSED.

Or maybe we need to properly check the napi_complete_done() return value.
diff --git a/drivers/net/ethernet/intel/ixgb/ixgb_main.c b/drivers/net/ethernet/intel/ixgb/ixgb_main.c
index d3e72d0f66ef428b08e4bd88508e05b734bc43a4..c4c565c982a98a5891603cedcdcb72dc1c401813 100644
--- a/drivers/net/ethernet/intel/ixgb/ixgb_main.c
+++ b/drivers/net/ethernet/intel/ixgb/ixgb_main.c
@@ -1773,8 +1773,8 @@ ixgb_clean(struct napi_struct *napi, int budget)
 	ixgb_clean_rx_irq(adapter, &work_done, budget);
 
 	/* If budget not fully consumed, exit the polling mode */
-	if (work_done < budget) {
-		napi_complete_done(napi, work_done);
+	if (work_done < budget &&
+	    napi_complete_done(napi, work_done)) {
 		if (!test_bit(__IXGB_DOWN, &adapter->flags))
 			ixgb_irq_enable(adapter);
 	}
* Re: pegged softirq and NAPI race (?)
2018-09-18 21:13 ` Eric Dumazet
@ 2018-09-18 21:21 ` Eric Dumazet
2018-09-18 21:36 ` Jeff Kirsher
2018-09-18 21:21 ` Song Liu
2018-09-18 21:25 ` Florian Fainelli
2 siblings, 1 reply; 22+ messages in thread
From: Eric Dumazet @ 2018-09-18 21:21 UTC (permalink / raw)
To: Eric Dumazet, songliubraving
Cc: Alexei Starovoitov, netdev, Jeff Kirsher, Alexander Duyck,
michael.chan, kernel-team
On 09/18/2018 02:13 PM, Eric Dumazet wrote:
> On Tue, Sep 18, 2018 at 1:37 PM Song Liu <songliubraving@fb.com> wrote:
>>
>> It looks like a patch like the following fixes the issue for ixgbe, but I
>> cannot explain it yet.
>>
>> Does this ring a bell?
>
> I dunno, it looks like the NIC is generating an interrupt when it should not,
> and constantly sets NAPI_STATE_MISSED.
>
> Or maybe we need to properly check the napi_complete_done() return value.
>
> [...]
>
The ixgbe patch would be:
diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
index 604282f03d236e4358fc91e64d8ba00a9b36cb8c..80d00aecb6e3e3e950ce6309bfe3639953dd73d9 100644
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
@@ -3196,12 +3196,12 @@ int ixgbe_poll(struct napi_struct *napi, int budget)
 		return budget;
 
 	/* all work done, exit the polling mode */
-	napi_complete_done(napi, work_done);
-	if (adapter->rx_itr_setting & 1)
-		ixgbe_set_itr(q_vector);
-	if (!test_bit(__IXGBE_DOWN, &adapter->state))
-		ixgbe_irq_enable_queues(adapter, BIT_ULL(q_vector->v_idx));
-
+	if (napi_complete_done(napi, work_done)) {
+		if (adapter->rx_itr_setting & 1)
+			ixgbe_set_itr(q_vector);
+		if (!test_bit(__IXGBE_DOWN, &adapter->state))
+			ixgbe_irq_enable_queues(adapter, BIT_ULL(q_vector->v_idx));
+	}
 	return min(work_done, budget - 1);
 }
* Re: pegged softirq and NAPI race (?)
2018-09-18 21:13 ` Eric Dumazet
2018-09-18 21:21 ` Eric Dumazet
@ 2018-09-18 21:21 ` Song Liu
2018-09-18 21:25 ` Eric Dumazet
2018-09-18 21:25 ` Florian Fainelli
2 siblings, 1 reply; 22+ messages in thread
From: Song Liu @ 2018-09-18 21:21 UTC (permalink / raw)
To: Eric Dumazet
Cc: Alexei Starovoitov, netdev, Jeff Kirsher, Alexander Duyck,
michael.chan@broadcom.com, Kernel Team
> On Sep 18, 2018, at 2:13 PM, Eric Dumazet <edumazet@google.com> wrote:
>
> On Tue, Sep 18, 2018 at 1:37 PM Song Liu <songliubraving@fb.com> wrote:
>>
>> It looks like a patch like the following fixes the issue for ixgbe, but I
>> cannot explain it yet.
>>
>> Does this ring a bell?
>
> I dunno, it looks like the NIC is generating an interrupt when it should not,
> and constantly sets NAPI_STATE_MISSED.
>
> Or maybe we need to properly check the napi_complete_done() return value.
>
> [...]
Thanks Eric!
I was looking at exactly this part. :) And it seems to work!
I will run a bigger test and update shortly.
Best,
Song
* Re: pegged softirq and NAPI race (?)
2018-09-18 21:21 ` Song Liu
@ 2018-09-18 21:25 ` Eric Dumazet
0 siblings, 0 replies; 22+ messages in thread
From: Eric Dumazet @ 2018-09-18 21:25 UTC (permalink / raw)
To: songliubraving
Cc: Alexei Starovoitov, netdev, Jeff Kirsher, Alexander Duyck,
michael.chan, kernel-team
On Tue, Sep 18, 2018 at 2:22 PM Song Liu <songliubraving@fb.com> wrote:
>
> Thanks Eric!
>
> I was looking at exactly this part. :) And it seems to work!
And this should also make busy polling a bit faster, avoiding
enabling/receiving interrupts in the busy loop.
>
> I will run a bigger test and update shortly.
* Re: pegged softirq and NAPI race (?)
2018-09-18 21:13 ` Eric Dumazet
2018-09-18 21:21 ` Eric Dumazet
2018-09-18 21:21 ` Song Liu
@ 2018-09-18 21:25 ` Florian Fainelli
2018-09-18 21:28 ` Eric Dumazet
2 siblings, 1 reply; 22+ messages in thread
From: Florian Fainelli @ 2018-09-18 21:25 UTC (permalink / raw)
To: Eric Dumazet, songliubraving
Cc: Alexei Starovoitov, netdev, Jeff Kirsher, Alexander Duyck,
michael.chan, kernel-team
On 09/18/2018 02:13 PM, Eric Dumazet wrote:
> On Tue, Sep 18, 2018 at 1:37 PM Song Liu <songliubraving@fb.com> wrote:
>>
>
>> It looks like a patch like the following fixes the issue for ixgbe, but I
>> cannot explain it yet.
>>
>> Does this ring a bell?
>
> I dunno, it looks like the NIC is generating an interrupt when it should not,
> and constantly sets NAPI_STATE_MISSED.
>
> Or maybe we need to properly check the napi_complete_done() return value.
>
> diff --git a/drivers/net/ethernet/intel/ixgb/ixgb_main.c
> b/drivers/net/ethernet/intel/ixgb/ixgb_main.c
> index d3e72d0f66ef428b08e4bd88508e05b734bc43a4..c4c565c982a98a5891603cedcdcb72dc1c401813
> 100644
> --- a/drivers/net/ethernet/intel/ixgb/ixgb_main.c
> +++ b/drivers/net/ethernet/intel/ixgb/ixgb_main.c
> @@ -1773,8 +1773,8 @@ ixgb_clean(struct napi_struct *napi, int budget)
> ixgb_clean_rx_irq(adapter, &work_done, budget);
>
> /* If budget not fully consumed, exit the polling mode */
> - if (work_done < budget) {
> - napi_complete_done(napi, work_done);
> + if (work_done < budget &&
> + napi_complete_done(napi, work_done)) {
> if (!test_bit(__IXGB_DOWN, &adapter->flags))
> ixgb_irq_enable(adapter);
> }
This would not be the only driver doing this, unfortunately... should we
add a __must_check annotation to help catch these (mis)uses? Though that
could cause false positives for drivers using NAPI to clean their TX ring.
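A minimal sketch of what the suggested annotation would look like
(hypothetical; it was not applied in mainline at the time of this thread):

	/* With __must_check, the compiler warns whenever a caller ignores
	 * the return value, e.g. a driver that calls
	 * napi_complete_done(napi, work); without testing the result.
	 */
	bool __must_check napi_complete_done(struct napi_struct *n, int work_done);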
--
Florian
* Re: pegged softirq and NAPI race (?)
2018-09-18 21:25 ` Florian Fainelli
@ 2018-09-18 21:28 ` Eric Dumazet
2018-09-18 21:35 ` Florian Fainelli
2018-09-18 21:36 ` Song Liu
0 siblings, 2 replies; 22+ messages in thread
From: Eric Dumazet @ 2018-09-18 21:28 UTC (permalink / raw)
To: Florian Fainelli
Cc: songliubraving, Alexei Starovoitov, netdev, Jeff Kirsher,
Alexander Duyck, michael.chan, kernel-team
On Tue, Sep 18, 2018 at 2:25 PM Florian Fainelli <f.fainelli@gmail.com> wrote:
>
>
> This would not be the only driver doing this, unfortunately... should we
> add a __must_check annotation to help catch these (mis)uses? Though that
> could cause false positives for drivers using NAPI to clean their TX ring.
>
Well, before adding __must_check we would need to cook a (big) patch
series to change all occurrences.
Not clear why netpoll is the trigger?
* Re: pegged softirq and NAPI race (?)
2018-09-18 21:28 ` Eric Dumazet
@ 2018-09-18 21:35 ` Florian Fainelli
2018-09-18 21:36 ` Song Liu
1 sibling, 0 replies; 22+ messages in thread
From: Florian Fainelli @ 2018-09-18 21:35 UTC (permalink / raw)
To: Eric Dumazet
Cc: songliubraving, Alexei Starovoitov, netdev, Jeff Kirsher,
Alexander Duyck, michael.chan, kernel-team
On 09/18/2018 02:28 PM, Eric Dumazet wrote:
> On Tue, Sep 18, 2018 at 2:25 PM Florian Fainelli <f.fainelli@gmail.com> wrote:
>>
>>
>> This would not be the only driver doing this, unfortunately... should we
>> add a __must_check annotation to help catch these (mis)uses? Though that
>> could cause false positives for drivers using NAPI to clean their TX ring.
>>
>
> Well, before adding __must_check we would need to cook a (big) patch
> series to change all occurrences.
>
Sounds good, I don't mind submitting something unless you beat me to it.
>
> Not clear why netpoll is the trigger?
>
Me neither; this should be observable during normal operation as well,
though most likely you just get a spurious interrupt and nothing to
clean in the RX ring, so it goes unnoticed?
--
Florian
* Re: pegged softirq and NAPI race (?)
2018-09-18 21:28 ` Eric Dumazet
2018-09-18 21:35 ` Florian Fainelli
@ 2018-09-18 21:36 ` Song Liu
1 sibling, 0 replies; 22+ messages in thread
From: Song Liu @ 2018-09-18 21:36 UTC (permalink / raw)
To: Eric Dumazet
Cc: Florian Fainelli, Alexei Starovoitov, netdev, Jeff Kirsher,
Alexander Duyck, michael.chan@broadcom.com, Kernel Team
> On Sep 18, 2018, at 2:28 PM, Eric Dumazet <edumazet@google.com> wrote:
>
> On Tue, Sep 18, 2018 at 2:25 PM Florian Fainelli <f.fainelli@gmail.com> wrote:
>>
>>
>> This would not be the only driver doing this, unfortunately... should we
>> add a __must_check annotation to help catch these (mis)uses? Though that
>> could cause false positives for drivers using NAPI to clean their TX ring.
>>
>
> Well, before adding __must_check we would need to cook a (big) patch
> series to change all occurrences.
>
>
> Not clear why netpoll is the trigger?
From my observation, the trigger path is:
netpoll_poll_dev() => ndo_poll_controller() => napi_schedule_prep().

I guess it is a race between an IRQ/netpoll call to napi_schedule_prep()
and a napi_complete_done() whose return value is not handled. netpoll
just makes it trigger more often?
Unfortunately for me, bnxt already checks napi_complete_done(), so there
must be another bug with bnxt (at least with our 4.11-based kernel).
Song
* Re: pegged softirq and NAPI race (?)
2018-09-18 21:21 ` Eric Dumazet
@ 2018-09-18 21:36 ` Jeff Kirsher
2018-09-18 21:40 ` Song Liu
0 siblings, 1 reply; 22+ messages in thread
From: Jeff Kirsher @ 2018-09-18 21:36 UTC (permalink / raw)
To: Eric Dumazet, Eric Dumazet, songliubraving
Cc: Alexei Starovoitov, netdev, Alexander Duyck, michael.chan,
kernel-team
On Tue, 2018-09-18 at 14:21 -0700, Eric Dumazet wrote:
>
> On 09/18/2018 02:13 PM, Eric Dumazet wrote:
> > On Tue, Sep 18, 2018 at 1:37 PM Song Liu <songliubraving@fb.com> wrote:
> > >
> > > It looks like a patch like the following fixes the issue for ixgbe,
> > > but I cannot explain it yet.
> > >
> > > Does this ring a bell?
> >
> > I dunno, it looks like the NIC is generating an interrupt when it
> > should not, and constantly sets NAPI_STATE_MISSED.
> >
> > Or maybe we need to properly check the napi_complete_done() return value.
> >
> > [...]
>
> The ixgbe patch would be:
>
> [...]
Eric, after Song does some testing on these changes, will you be
submitting a formal patch? If so, make sure to send it to the
intel-wired-lan@lists.osuosl.org mailing list so I can pick up the fix.
By the way, thanks!
* Re: pegged softirq and NAPI race (?)
2018-09-18 21:36 ` Jeff Kirsher
@ 2018-09-18 21:40 ` Song Liu
2018-09-18 21:46 ` Eric Dumazet
0 siblings, 1 reply; 22+ messages in thread
From: Song Liu @ 2018-09-18 21:40 UTC (permalink / raw)
To: jeffrey.t.kirsher@intel.com
Cc: Eric Dumazet, Eric Dumazet, Alexei Starovoitov, netdev,
Alexander Duyck, michael.chan@broadcom.com, Kernel Team
> On Sep 18, 2018, at 2:36 PM, Jeff Kirsher <jeffrey.t.kirsher@intel.com> wrote:
>
> On Tue, 2018-09-18 at 14:21 -0700, Eric Dumazet wrote:
>>
>> [...]
>>
>> The ixgbe patch would be:
>>
>> [...]
>> -	napi_complete_done(napi, work_done);
>> -	if (adapter->rx_itr_setting & 1)
>> -		ixgbe_set_itr(q_vector);
>> -	if (!test_bit(__IXGBE_DOWN, &adapter->state))
>> -		ixgbe_irq_enable_queues(adapter, BIT_ULL(q_vector->v_idx));
>> -
>> +	if (napi_complete_done(napi, work_done)) {

Shall we add likely() here?

>> +		if (adapter->rx_itr_setting & 1)
>> +			ixgbe_set_itr(q_vector);
>> +		if (!test_bit(__IXGBE_DOWN, &adapter->state))
>> +			ixgbe_irq_enable_queues(adapter, BIT_ULL(q_vector->v_idx));
>> +	}
>> 	return min(work_done, budget - 1);
>> }
>>
>
> Eric, after Song does some testing on these changes, will you be
> submitting a formal patch? If so, make sure to send it to the
> intel-wired-lan@lists.osuosl.org mailing list so I can pick up the fix.
>
> By the way, thanks!
I would submit the patch if Eric prefers not to. :)
Thanks,
Song
* Re: pegged softirq and NAPI race (?)
2018-09-18 21:40 ` Song Liu
@ 2018-09-18 21:46 ` Eric Dumazet
2018-09-18 21:55 ` Song Liu
0 siblings, 1 reply; 22+ messages in thread
From: Eric Dumazet @ 2018-09-18 21:46 UTC (permalink / raw)
To: songliubraving
Cc: Jeff Kirsher, Eric Dumazet, Alexei Starovoitov, netdev,
Alexander Duyck, michael.chan, kernel-team
On Tue, Sep 18, 2018 at 2:41 PM Song Liu <songliubraving@fb.com> wrote:
>
> I would submit the patch if Eric prefers not to. :)
Hmmm.... maybe the bug is really in ixgbe_netpoll().

This whole ndo_poll_controller() stuff is crazy.

All sane implementations should only call napi_schedule().
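For context, here is a sketch of ixgbe's ndo_poll_controller as it looked
around this time (abridged from ixgbe_main.c of that era; details may differ
between trees). Instead of simply calling napi_schedule(), it re-enters the
MSI-X interrupt handler for every queue vector:

static void ixgbe_netpoll(struct net_device *netdev)
{
	struct ixgbe_adapter *adapter = netdev_priv(netdev);
	int i;

	/* if interface is down do nothing */
	if (test_bit(__IXGBE_DOWN, &adapter->state))
		return;

	/* re-enter the IRQ handler for each active queue vector */
	for (i = 0; i < adapter->num_q_vectors; i++)
		ixgbe_msix_clean_rings(0, adapter->q_vector[i]);
}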
* Re: pegged softirq and NAPI race (?)
2018-09-18 21:46 ` Eric Dumazet
@ 2018-09-18 21:55 ` Song Liu
2018-09-18 22:04 ` Eric Dumazet
0 siblings, 1 reply; 22+ messages in thread
From: Song Liu @ 2018-09-18 21:55 UTC (permalink / raw)
To: Eric Dumazet
Cc: Jeff Kirsher, Eric Dumazet, Alexei Starovoitov, netdev,
Alexander Duyck, michael.chan@broadcom.com, Kernel Team
> On Sep 18, 2018, at 2:46 PM, Eric Dumazet <edumazet@google.com> wrote:
>
> On Tue, Sep 18, 2018 at 2:41 PM Song Liu <songliubraving@fb.com> wrote:
>>
>> I would submit the patch if Eric prefers not to. :)
>
>
> Hmmm.... maybe the bug is really in ixgbe_netpoll().
>
> This whole ndo_poll_controller() stuff is crazy.
>
> All sane implementations should only call napi_schedule().
The current implementation is about identical to napi_schedule_irqoff().
Do we really need napi_schedule() instead?

On the other hand, I think we should check napi_complete_done()'s return
value in ixgbe_poll() anyway.
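For reference, napi_schedule_prep() as rewritten by 39e6c8208d7b (quoted
from net/core/dev.c of that era): unlike an open-coded test_and_set_bit(),
it records a MISSED event when SCHED is already set, which is what keeps
the poll loop alive when a driver never calls napi_complete_done():

bool napi_schedule_prep(struct napi_struct *n)
{
	unsigned long val, new;

	do {
		val = READ_ONCE(n->state);
		if (unlikely(val & NAPIF_STATE_DISABLE))
			return false;
		new = val | NAPIF_STATE_SCHED;

		/* Sets STATE_MISSED if STATE_SCHED was already set;
		 * the arithmetic form avoids a conditional branch.
		 */
		new |= (val & NAPIF_STATE_SCHED) / NAPIF_STATE_SCHED *
						   NAPIF_STATE_MISSED;
	} while (cmpxchg(&n->state, val, new) != val);

	return !(val & NAPIF_STATE_SCHED);
}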
Song
* Re: pegged softirq and NAPI race (?)
2018-09-18 21:55 ` Song Liu
@ 2018-09-18 22:04 ` Eric Dumazet
0 siblings, 0 replies; 22+ messages in thread
From: Eric Dumazet @ 2018-09-18 22:04 UTC (permalink / raw)
To: songliubraving
Cc: Jeff Kirsher, Eric Dumazet, Alexei Starovoitov, netdev,
Alexander Duyck, michael.chan, kernel-team
On Tue, Sep 18, 2018 at 2:56 PM Song Liu <songliubraving@fb.com> wrote:
>
> > On Sep 18, 2018, at 2:46 PM, Eric Dumazet <edumazet@google.com> wrote:
> >
> > On Tue, Sep 18, 2018 at 2:41 PM Song Liu <songliubraving@fb.com> wrote:
> >>
> >> I would submit the patch if Eric prefers not to. :)
> >
> > Hmmm.... maybe the bug is really in ixgbe_netpoll().
> >
> > This whole ndo_poll_controller() stuff is crazy.
> >
> > All sane implementations should only call napi_schedule().
>
> The current implementation is about identical to napi_schedule_irqoff().
> Do we really need napi_schedule() instead?
>
> On the other hand, I think we should check napi_complete_done()'s return
> value in ixgbe_poll() anyway.
It seems the netpoll code is racy, since another cpu might be calling
the ixgbe poll() function and return early from napi_complete_done():
	if (unlikely(n->state & (NAPIF_STATE_NPSVC |
				 NAPIF_STATE_IN_BUSY_POLL)))
		return false;
This is why netpoll-enabled drivers _must_ check the
napi_complete[_done]() return value; otherwise they might re-enable IRQs
when they should not.
Thread overview: 22+ messages
2018-09-18 8:41 pegged softirq and NAPI race (?) Song Liu
2018-09-18 13:45 ` Eric Dumazet
2018-09-18 16:19 ` Song Liu
2018-09-18 16:31 ` Rik van Riel
2018-09-18 16:33 ` Eric Dumazet
2018-09-18 16:49 ` Song Liu
2018-09-18 17:51 ` Alexei Starovoitov
2018-09-18 18:17 ` Eric Dumazet
2018-09-18 20:37 ` Song Liu
2018-09-18 21:13 ` Eric Dumazet
2018-09-18 21:21 ` Eric Dumazet
2018-09-18 21:36 ` Jeff Kirsher
2018-09-18 21:40 ` Song Liu
2018-09-18 21:46 ` Eric Dumazet
2018-09-18 21:55 ` Song Liu
2018-09-18 22:04 ` Eric Dumazet
2018-09-18 21:21 ` Song Liu
2018-09-18 21:25 ` Eric Dumazet
2018-09-18 21:25 ` Florian Fainelli
2018-09-18 21:28 ` Eric Dumazet
2018-09-18 21:35 ` Florian Fainelli
2018-09-18 21:36 ` Song Liu