* pegged softirq and NAPI race (?)
From: Song Liu @ 2018-09-18 8:41 UTC (permalink / raw)
To: Networking
Cc: edumazet@google.com, jeffrey.t.kirsher@intel.com,
alexander.h.duyck@intel.com, Michael Chan, Kernel Team
We are debugging an issue where netconsole messages trigger a pegged
softirq (ksoftirqd taking 100% of a CPU for many seconds). We found this
issue in production with both bnxt and ixgbe, on a 4.11-based kernel. It
is easily reproducible with ixgbe on 4.11 and on the latest net/net-next
(see [1] for more detail).
After debugging for some time, we found that this issue is likely related
to 39e6c8208d7b ("net: solve a NAPI race"). After reverting this commit,
the steps described in [1] no longer reproduce the issue on ixgbe.
Reverting it also reduces how often we hit the issue with bnxt (it still
happens, at a lower rate).
I tried to fix this issue with a relaxed variant (or older version) of
napi_schedule_prep() in netpoll, just like the one in napi_watchdog().
However, my tests do not always go as expected.

Please share your comments/suggestions on which direction we should try
in order to fix this.
Thanks in advance!
Song
[1] https://www.spinics.net/lists/netdev/msg522328.html

* Re: pegged softirq and NAPI race (?)
From: Eric Dumazet @ 2018-09-18 13:45 UTC (permalink / raw)
  To: songliubraving
  Cc: netdev, Jeff Kirsher, Alexander Duyck, michael.chan, kernel-team

On Tue, Sep 18, 2018 at 1:41 AM Song Liu <songliubraving@fb.com> wrote:
>
> We are debugging an issue where netconsole messages trigger a pegged
> softirq (ksoftirqd taking 100% of a CPU for many seconds). We found
> this issue in production with both bnxt and ixgbe, on a 4.11-based
> kernel. It is easily reproducible with ixgbe on 4.11 and on the latest
> net/net-next (see [1] for more detail).
> [...]
> [1] https://www.spinics.net/lists/netdev/msg522328.html

You have not traced ixgbe to understand why the driver hits
"clean_complete=false" all the time?

* Re: pegged softirq and NAPI race (?)
From: Song Liu @ 2018-09-18 16:19 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: netdev, Jeff Kirsher, Alexander Duyck, michael.chan@broadcom.com,
	Kernel Team

> On Sep 18, 2018, at 6:45 AM, Eric Dumazet <edumazet@google.com> wrote:
>
> You have not traced ixgbe to understand why the driver hits
> "clean_complete=false" all the time?

The trace showed that we got "clean_complete=false" because
ixgbe_clean_rx_irq() used the entire budget (64). It feels like the
driver is tricked into processing old data on the rx_ring one more time.

Have you seen a similar issue?

Thanks,
Song
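
For context, the contract between net_rx_action() and a driver's poll
function is what produces this symptom: a poll that consumes its entire
weight stays scheduled and is polled again, so a driver that always
reports 64/64 keeps the softirq pegged. A condensed paraphrase of that
loop (not the verbatim kernel source, which also handles time limits,
GRO flushing and repoll bookkeeping):

static void net_rx_action_sketch(struct softnet_data *sd)
{
	int budget = 300;	/* netdev_budget: packets per softirq run */

	while (!list_empty(&sd->poll_list)) {
		struct napi_struct *n;
		int work, weight;

		n = list_first_entry(&sd->poll_list,
				     struct napi_struct, poll_list);
		list_del_init(&n->poll_list);

		weight = n->weight;	/* per-call cap, typically 64 */
		work = n->poll(n, weight);

		/* A poll that used its whole weight is assumed to have
		 * more work: it stays scheduled and goes back on the
		 * list.  Only a poll that returns less than its weight
		 * calls napi_complete_done() and re-enables the IRQ.
		 */
		if (work == weight)
			list_add_tail(&n->poll_list, &sd->poll_list);

		budget -= work;
		if (budget <= 0) {
			/* defer the rest to ksoftirqd */
			__raise_softirq_irqoff(NET_RX_SOFTIRQ);
			break;
		}
	}
}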

* Re: pegged softirq and NAPI race (?)
From: Rik van Riel @ 2018-09-18 16:31 UTC (permalink / raw)
  To: Song Liu, Eric Dumazet
  Cc: netdev, Jeff Kirsher, Alexander Duyck, michael.chan@broadcom.com,
	Kernel Team

On Tue, 2018-09-18 at 12:19 -0400, Song Liu wrote:
> The trace showed that we got "clean_complete=false" because
> ixgbe_clean_rx_irq() used the entire budget (64). It feels like the
> driver is tricked into processing old data on the rx_ring one more
> time.

A quick reading of the code suggests that means polling cannot keep up
with the rate of incoming packets. That should not be a surprise, given
that polling appears to happen on just one CPU, while interrupt-driven
packet delivery was fanned out across a larger number of CPUs.

Does the NAPI code have any way to periodically force a return to IRQ
mode, since multiple CPUs in IRQ mode can keep up with packets better
than a single CPU in polling mode?

Alternatively, is NAPI with multi-queue network adapters supposed to be
polling on multiple CPUs, but simply failing to do so in this case?

* Re: pegged softirq and NAPI race (?)
From: Eric Dumazet @ 2018-09-18 16:33 UTC (permalink / raw)
  To: songliubraving
  Cc: netdev, Jeff Kirsher, Alexander Duyck, michael.chan, kernel-team

On Tue, Sep 18, 2018 at 9:19 AM Song Liu <songliubraving@fb.com> wrote:
>
> The trace showed that we got "clean_complete=false" because
> ixgbe_clean_rx_irq() used the entire budget (64). It feels like the
> driver is tricked into processing old data on the rx_ring one more
> time.

Process old data ???? That would be quite a horrible bug!

Probably ASAN would help here, detecting use-after-free or things like
that.

* Re: pegged softirq and NAPI race (?)
From: Song Liu @ 2018-09-18 16:49 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: netdev, Jeff Kirsher, Alexander Duyck, michael.chan@broadcom.com,
	Kernel Team

> On Sep 18, 2018, at 9:33 AM, Eric Dumazet <edumazet@google.com> wrote:
>
> Probably ASAN would help here, detecting use-after-free or things like
> that.

I have tried KASAN; unfortunately, it doesn't yield any useful data.

I think it is not a use-after-free of some skb. It is more like a bug in
the handling of the rx/tx ring.

Thanks,
Song

* Re: pegged softirq and NAPI race (?)
From: Alexei Starovoitov @ 2018-09-18 17:51 UTC (permalink / raw)
  To: Eric Dumazet, Song Liu
  Cc: netdev, Jeff Kirsher, Alexander Duyck, michael.chan@broadcom.com,
	Kernel Team

On 9/18/18 6:45 AM, Eric Dumazet wrote:
> You have not traced ixgbe to understand why the driver hits
> "clean_complete=false" all the time?

Eric,

I'm looking at commit 39e6c8208d7b and wondering: it does
clear_bit(NAPI_STATE_MISSED, ...) for busy_poll_stop(), but not for
netpoll. Can that be an issue?

If so, something like the patch below would be needed:

diff --git a/net/core/netpoll.c b/net/core/netpoll.c
index 57557a6a950c..a848be6b503c 100644
--- a/net/core/netpoll.c
+++ b/net/core/netpoll.c
@@ -172,6 +172,7 @@ static void poll_one_napi(struct napi_struct *napi)
 	trace_napi_poll(napi, work, 0);
 
 	clear_bit(NAPI_STATE_NPSVC, &napi->state);
+	clear_bit(NAPI_STATE_MISSED, &napi->state);
 }
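
For reference, poll_one_napi() in net/core/netpoll.c looked roughly like
this at the time (condensed, with paraphrased comments — not the exact
source), which shows where the proposed clear_bit() would sit:

static void poll_one_napi(struct napi_struct *napi)
{
	int work;

	/* Leave instances that are not currently scheduled to the
	 * regular softirq path.
	 */
	if (!test_bit(NAPI_STATE_SCHED, &napi->state))
		return;

	/* If NPSVC was already set, the napi is being disabled: abort. */
	if (test_and_set_bit(NAPI_STATE_NPSVC, &napi->state))
		return;

	/* A budget of 0 means netpoll only wants the Tx path cleaned. */
	work = napi->poll(napi, 0);
	WARN_ONCE(work, "%pF exceeded budget in poll\n", napi->poll);
	trace_napi_poll(napi, work, 0);

	clear_bit(NAPI_STATE_NPSVC, &napi->state);
}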

* Re: pegged softirq and NAPI race (?)
From: Eric Dumazet @ 2018-09-18 18:17 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: songliubraving, netdev, Jeff Kirsher, Alexander Duyck,
	michael.chan, kernel-team

On Tue, Sep 18, 2018 at 10:51 AM Alexei Starovoitov <ast@fb.com> wrote:
>
> I'm looking at commit 39e6c8208d7b and wondering: it does
> clear_bit(NAPI_STATE_MISSED, ...) for busy_poll_stop(), but not for
> netpoll. Can that be an issue?

NAPI_STATE_MISSED should only be cleared under strict circumstances.

The clear in busy_poll_stop() is really an optimization (as explained in
the comment).

The bit is cleared when napi_complete_done() is eventually called, but
if ixgbe always handles 64 RX frames in its poll function,
napi_complete_done() will not be called. The bug is in ixgbe, pretending
its poll function should be called forever.
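
The state machine being discussed lives in napi_schedule_prep() and
napi_complete_done(). A simplified paraphrase of the logic introduced by
39e6c8208d7b (the flag names are real; the bodies are condensed, not the
exact kernel source):

bool napi_schedule_prep(struct napi_struct *n)
{
	unsigned long val, new;

	do {
		val = READ_ONCE(n->state);
		if (unlikely(val & NAPIF_STATE_DISABLE))
			return false;
		new = val | NAPIF_STATE_SCHED;
		/* Already scheduled: remember that someone tried to
		 * schedule us, so the current poll gets re-run.
		 */
		if (val & NAPIF_STATE_SCHED)
			new |= NAPIF_STATE_MISSED;
	} while (cmpxchg(&n->state, val, new) != val);

	return !(val & NAPIF_STATE_SCHED);
}

bool napi_complete_done(struct napi_struct *n, int work_done)
{
	unsigned long val, new;

	/* netpoll or busy polling currently owns this instance */
	if (unlikely(n->state & (NAPIF_STATE_NPSVC |
				 NAPIF_STATE_IN_BUSY_POLL)))
		return false;

	do {
		val = READ_ONCE(n->state);
		new = val & ~(NAPIF_STATE_MISSED | NAPIF_STATE_SCHED);
		if (unlikely(val & NAPIF_STATE_MISSED))
			new |= NAPIF_STATE_SCHED;
	} while (cmpxchg(&n->state, val, new) != val);

	/* A schedule attempt raced with our poll: run again. */
	if (unlikely(val & NAPIF_STATE_MISSED)) {
		__napi_schedule(n);
		return false;
	}
	return true;
}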

* Re: pegged softirq and NAPI race (?)
From: Song Liu @ 2018-09-18 20:37 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Alexei Starovoitov, netdev, Jeff Kirsher, Alexander Duyck,
	michael.chan@broadcom.com, Kernel Team

> On Sep 18, 2018, at 11:17 AM, Eric Dumazet <edumazet@google.com> wrote:
>
> The bit is cleared when napi_complete_done() is eventually called, but
> if ixgbe always handles 64 RX frames in its poll function,
> napi_complete_done() will not be called. The bug is in ixgbe,
> pretending its poll function should be called forever.

Looks like a patch like the following fixes the issue for ixgbe. But I
cannot explain it yet.

Does this ring a bell?

Thanks,
Song

diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
index 787c84fb20dd..51611f799dae 100644
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
@@ -3059,11 +3059,14 @@ static irqreturn_t ixgbe_msix_other(int irq, void *data)
 static irqreturn_t ixgbe_msix_clean_rings(int irq, void *data)
 {
 	struct ixgbe_q_vector *q_vector = data;
+	struct napi_struct *napi = &q_vector->napi;
 
 	/* EIAM disabled interrupts (on this vector) for us */
-	if (q_vector->rx.ring || q_vector->tx.ring)
-		napi_schedule_irqoff(&q_vector->napi);
+	if ((q_vector->rx.ring || q_vector->tx.ring) &&
+	    !napi_disable_pending(napi) &&
+	    !test_and_set_bit(NAPI_STATE_SCHED, &napi->state))
+		__napi_schedule_irqoff(napi);
 
 	return IRQ_HANDLED;
 }

* Re: pegged softirq and NAPI race (?)
From: Eric Dumazet @ 2018-09-18 21:13 UTC (permalink / raw)
  To: songliubraving
  Cc: Alexei Starovoitov, netdev, Jeff Kirsher, Alexander Duyck,
	michael.chan, kernel-team

On Tue, Sep 18, 2018 at 1:37 PM Song Liu <songliubraving@fb.com> wrote:
>
> Looks like a patch like the following fixes the issue for ixgbe. But I
> cannot explain it yet.
>
> Does this ring a bell?

I dunno; it looks like the NIC is generating an interrupt while it
should not, and constantly sets NAPI_STATE_MISSED.

Or maybe we need to properly check the napi_complete_done() return
value.

diff --git a/drivers/net/ethernet/intel/ixgb/ixgb_main.c b/drivers/net/ethernet/intel/ixgb/ixgb_main.c
index d3e72d0f66ef428b08e4bd88508e05b734bc43a4..c4c565c982a98a5891603cedcdcb72dc1c401813 100644
--- a/drivers/net/ethernet/intel/ixgb/ixgb_main.c
+++ b/drivers/net/ethernet/intel/ixgb/ixgb_main.c
@@ -1773,8 +1773,8 @@ ixgb_clean(struct napi_struct *napi, int budget)
 	ixgb_clean_rx_irq(adapter, &work_done, budget);
 
 	/* If budget not fully consumed, exit the polling mode */
-	if (work_done < budget) {
-		napi_complete_done(napi, work_done);
+	if (work_done < budget &&
+	    napi_complete_done(napi, work_done)) {
 		if (!test_bit(__IXGB_DOWN, &adapter->flags))
 			ixgb_irq_enable(adapter);
 	}

* Re: pegged softirq and NAPI race (?)
From: Eric Dumazet @ 2018-09-18 21:21 UTC (permalink / raw)
  To: Eric Dumazet, songliubraving
  Cc: Alexei Starovoitov, netdev, Jeff Kirsher, Alexander Duyck,
	michael.chan, kernel-team

On 09/18/2018 02:13 PM, Eric Dumazet wrote:
> Or maybe we need to properly check the napi_complete_done() return
> value.

The ixgbe patch would be:

diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
index 604282f03d236e4358fc91e64d8ba00a9b36cb8c..80d00aecb6e3e3e950ce6309bfe3639953dd73d9 100644
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
@@ -3196,12 +3196,12 @@ int ixgbe_poll(struct napi_struct *napi, int budget)
 		return budget;
 
 	/* all work done, exit the polling mode */
-	napi_complete_done(napi, work_done);
-	if (adapter->rx_itr_setting & 1)
-		ixgbe_set_itr(q_vector);
-	if (!test_bit(__IXGBE_DOWN, &adapter->state))
-		ixgbe_irq_enable_queues(adapter, BIT_ULL(q_vector->v_idx));
-
+	if (napi_complete_done(napi, work_done)) {
+		if (adapter->rx_itr_setting & 1)
+			ixgbe_set_itr(q_vector);
+		if (!test_bit(__IXGBE_DOWN, &adapter->state))
+			ixgbe_irq_enable_queues(adapter, BIT_ULL(q_vector->v_idx));
+	}
 	return min(work_done, budget - 1);
 }

* Re: pegged softirq and NAPI race (?)
From: Jeff Kirsher @ 2018-09-18 21:36 UTC (permalink / raw)
  To: Eric Dumazet, Eric Dumazet, songliubraving
  Cc: Alexei Starovoitov, netdev, Alexander Duyck, michael.chan,
	kernel-team

On Tue, 2018-09-18 at 14:21 -0700, Eric Dumazet wrote:
>
> The ixgbe patch would be:
> [...]

Eric, after Song does some testing on these changes, will you be
submitting a formal patch? If so, make sure to send it to the
intel-wired-lan@lists.osuosl.org mailing list so I can pick up the fix.

By the way, thanks!

* Re: pegged softirq and NAPI race (?)
From: Song Liu @ 2018-09-18 21:40 UTC (permalink / raw)
  To: jeffrey.t.kirsher@intel.com
  Cc: Eric Dumazet, Eric Dumazet, Alexei Starovoitov, netdev,
	Alexander Duyck, michael.chan@broadcom.com, Kernel Team

> On Sep 18, 2018, at 2:36 PM, Jeff Kirsher <jeffrey.t.kirsher@intel.com> wrote:
>
> On Tue, 2018-09-18 at 14:21 -0700, Eric Dumazet wrote:
>> [...]
>> +	if (napi_complete_done(napi, work_done)) {

Shall we add likely() here?

>> +		if (adapter->rx_itr_setting & 1)
>> +			ixgbe_set_itr(q_vector);
>> +		if (!test_bit(__IXGBE_DOWN, &adapter->state))
>> +			ixgbe_irq_enable_queues(adapter, BIT_ULL(q_vector->v_idx));
>> +	}
>
> Eric, after Song does some testing on these changes, will you be
> submitting a formal patch? If so, make sure to send it to the
> intel-wired-lan@lists.osuosl.org mailing list so I can pick up the fix.
>
> By the way, thanks!

I would submit the patch if Eric prefers not to. :)

Thanks,
Song

* Re: pegged softirq and NAPI race (?)
From: Eric Dumazet @ 2018-09-18 21:46 UTC (permalink / raw)
  To: songliubraving
  Cc: Jeff Kirsher, Eric Dumazet, Alexei Starovoitov, netdev,
	Alexander Duyck, michael.chan, kernel-team

On Tue, Sep 18, 2018 at 2:41 PM Song Liu <songliubraving@fb.com> wrote:
>
> I would submit the patch if Eric prefers not to. :)

Hmmm... maybe the bug is really in ixgbe_netpoll().

This whole ndo_poll_controller() stuff is crazy.

All sane implementations should only call napi_schedule().
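
For illustration, the "sane" shape of an ndo_poll_controller
implementation alluded to here would be roughly the sketch below
(my_adapter and the q_vector layout are placeholder names modeled on
ixgbe-style drivers, not a specific in-tree implementation):

static void my_netpoll(struct net_device *dev)
{
	struct my_adapter *adapter = netdev_priv(dev);
	int i;

	/* Just kick every NAPI instance and let the regular poll path
	 * drain the rings with proper synchronization, instead of
	 * netpoll invoking the poll functions directly.
	 */
	for (i = 0; i < adapter->num_q_vectors; i++)
		napi_schedule(&adapter->q_vector[i]->napi);
}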

* Re: pegged softirq and NAPI race (?)
From: Song Liu @ 2018-09-18 21:55 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Jeff Kirsher, Eric Dumazet, Alexei Starovoitov, netdev,
	Alexander Duyck, michael.chan@broadcom.com, Kernel Team

> On Sep 18, 2018, at 2:46 PM, Eric Dumazet <edumazet@google.com> wrote:
>
> All sane implementations should only call napi_schedule().

The current implementation is about identical to napi_schedule_irqoff().
Do we really need napi_schedule() instead?

On the other hand, I think we should check napi_complete_done() in
ixgbe_poll() anyway.

Song

* Re: pegged softirq and NAPI race (?)
From: Eric Dumazet @ 2018-09-18 22:04 UTC (permalink / raw)
  To: songliubraving
  Cc: Jeff Kirsher, Eric Dumazet, Alexei Starovoitov, netdev,
	Alexander Duyck, michael.chan, kernel-team

On Tue, Sep 18, 2018 at 2:56 PM Song Liu <songliubraving@fb.com> wrote:
>
> On the other hand, I think we should check napi_complete_done() in
> ixgbe_poll() anyway.

It seems the netpoll code is racy, since another cpu might be calling
ixgbe's poll() and return early from napi_complete_done():

	if (unlikely(n->state & (NAPIF_STATE_NPSVC |
				 NAPIF_STATE_IN_BUSY_POLL)))
		return false;

This is why netpoll-enabled drivers _must_ check the
napi_complete[_done]() return value; otherwise they might re-enable IRQs
when they should not.
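
In driver terms, the safe pattern looks roughly like the sketch below (a
generic illustration; my_clean_rx_irq() and my_irq_enable() are
placeholder names, not real kernel APIs):

static int my_poll(struct napi_struct *napi, int budget)
{
	struct my_queue *q = container_of(napi, struct my_queue, napi);
	int work_done = my_clean_rx_irq(q, budget);

	if (work_done == budget)
		return budget;	/* more work: stay in polling mode */

	/* Racy (pre-fix) pattern: unconditionally re-enabling the IRQ
	 * here is wrong, because napi_complete_done() can return false
	 * while netpoll (NAPIF_STATE_NPSVC) or busy polling owns the
	 * instance:
	 *
	 *	napi_complete_done(napi, work_done);
	 *	my_irq_enable(q);
	 *
	 * Safe pattern: only re-arm the interrupt if we really left
	 * polling mode.
	 */
	if (napi_complete_done(napi, work_done))
		my_irq_enable(q);

	return work_done;
}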

* Re: pegged softirq and NAPI race (?)
From: Song Liu @ 2018-09-18 21:21 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Alexei Starovoitov, netdev, Jeff Kirsher, Alexander Duyck,
	michael.chan@broadcom.com, Kernel Team

> On Sep 18, 2018, at 2:13 PM, Eric Dumazet <edumazet@google.com> wrote:
>
> Or maybe we need to properly check the napi_complete_done() return
> value.
> [...]

Thanks Eric!

I was looking at exactly this part. :) And it seems to be working!

I will run a bigger test and update shortly.

Best,
Song

* Re: pegged softirq and NAPI race (?)
From: Eric Dumazet @ 2018-09-18 21:25 UTC (permalink / raw)
  To: songliubraving
  Cc: Alexei Starovoitov, netdev, Jeff Kirsher, Alexander Duyck,
	michael.chan, kernel-team

On Tue, Sep 18, 2018 at 2:22 PM Song Liu <songliubraving@fb.com> wrote:
>
> Thanks Eric!
>
> I was looking at exactly this part. :) And it seems to be working!

And this should also make busy polling a bit faster, avoiding
enabling/receiving interrupts in the busy loop.

> I will run a bigger test and update shortly.

* Re: pegged softirq and NAPI race (?)
From: Florian Fainelli @ 2018-09-18 21:25 UTC (permalink / raw)
  To: Eric Dumazet, songliubraving
  Cc: Alexei Starovoitov, netdev, Jeff Kirsher, Alexander Duyck,
	michael.chan, kernel-team

On 09/18/2018 02:13 PM, Eric Dumazet wrote:
>
> Or maybe we need to properly check the napi_complete_done() return
> value.

This would not be the only driver doing this, unfortunately... should we
add a __must_check annotation to help catch those (mis)uses? Though that
could cause false positives for drivers using NAPI to clean their TX
ring.

--
Florian
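
Concretely, the suggestion amounts to annotating the prototype in
include/linux/netdevice.h along these lines (illustrative only, not a
merged change):

-bool napi_complete_done(struct napi_struct *n, int work_done);
+bool __must_check napi_complete_done(struct napi_struct *n, int work_done);

With that annotation, a caller that ignores the return value would get a
-Wunused-result warning at build time.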

* Re: pegged softirq and NAPI race (?)
From: Eric Dumazet @ 2018-09-18 21:28 UTC (permalink / raw)
  To: Florian Fainelli
  Cc: songliubraving, Alexei Starovoitov, netdev, Jeff Kirsher,
	Alexander Duyck, michael.chan, kernel-team

On Tue, Sep 18, 2018 at 2:25 PM Florian Fainelli <f.fainelli@gmail.com> wrote:
>
> This would not be the only driver doing this, unfortunately... should
> we add a __must_check annotation to help catch those (mis)uses?

Well, before adding __must_check we would need to cook a (big) patch
series to change all occurrences.

Not clear why netpoll is the trigger?

* Re: pegged softirq and NAPI race (?)
From: Florian Fainelli @ 2018-09-18 21:35 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: songliubraving, Alexei Starovoitov, netdev, Jeff Kirsher,
	Alexander Duyck, michael.chan, kernel-team

On 09/18/2018 02:28 PM, Eric Dumazet wrote:
> Well, before adding __must_check we would need to cook a (big) patch
> series to change all occurrences.

Sounds good; I don't mind submitting something unless you beat me to it.

> Not clear why netpoll is the trigger?

Me either. This should be observable during normal operation as well,
though most likely you just get a spurious interrupt and nothing to
clean in the RX ring, so it goes unnoticed?

--
Florian

* Re: pegged softirq and NAPI race (?)
From: Song Liu @ 2018-09-18 21:36 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Florian Fainelli, Alexei Starovoitov, netdev, Jeff Kirsher,
	Alexander Duyck, michael.chan@broadcom.com, Kernel Team

> On Sep 18, 2018, at 2:28 PM, Eric Dumazet <edumazet@google.com> wrote:
>
> Not clear why netpoll is the trigger?

From my observation, the trigger path is:

    netpoll_poll_dev() => ndo_poll_controller() => napi_schedule_prep()

I guess it is a race between IRQ/netpoll => napi_schedule_prep() and an
unchecked napi_complete_done()? netpoll just makes it trigger more
often?

Unfortunately for me, bnxt already checks napi_complete_done(), so there
should be another bug with bnxt (at least with our 4.11-based kernel).

Song

end of thread, other threads: [~2018-09-19  3:39 UTC | newest]

Thread overview: 22+ messages (download: mbox.gz / follow: Atom feed)
2018-09-18  8:41 pegged softirq and NAPI race (?) Song Liu
2018-09-18 13:45 ` Eric Dumazet
2018-09-18 16:19   ` Song Liu
2018-09-18 16:31     ` Rik van Riel
2018-09-18 16:33     ` Eric Dumazet
2018-09-18 16:49       ` Song Liu
2018-09-18 17:51   ` Alexei Starovoitov
2018-09-18 18:17     ` Eric Dumazet
2018-09-18 20:37       ` Song Liu
2018-09-18 21:13         ` Eric Dumazet
2018-09-18 21:21           ` Eric Dumazet
2018-09-18 21:36             ` Jeff Kirsher
2018-09-18 21:40               ` Song Liu
2018-09-18 21:46                 ` Eric Dumazet
2018-09-18 21:55                   ` Song Liu
2018-09-18 22:04                     ` Eric Dumazet
2018-09-18 21:21           ` Song Liu
2018-09-18 21:25             ` Eric Dumazet
2018-09-18 21:25           ` Florian Fainelli
2018-09-18 21:28             ` Eric Dumazet
2018-09-18 21:35               ` Florian Fainelli
2018-09-18 21:36               ` Song Liu