* [Intel-wired-lan] [PATCH net-next v1 0/2] GRO drop accounting @ 2021-01-06 21:55 Jesse Brandeburg 2021-01-06 21:55 ` [Intel-wired-lan] [PATCH net-next v1 1/2] net: core: count drops from GRO Jesse Brandeburg 2021-01-06 21:55 ` [Intel-wired-lan] [PATCH net-next v1 2/2] ice: remove GRO drop accounting Jesse Brandeburg 0 siblings, 2 replies; 16+ messages in thread From: Jesse Brandeburg @ 2021-01-06 21:55 UTC (permalink / raw) To: intel-wired-lan Add some accounting for when the stack drops a packet that a driver tried to indicate with a gro* call. This helps users track where packets might have disappeared to and will show up in the netdevice stats that already exist. After that, remove the driver specific workaround that was added to do the same, just scoped too small. Jesse Brandeburg (2): net: core: count drops from GRO ice: remove GRO drop accounting drivers/net/ethernet/intel/ice/ice.h | 1 - drivers/net/ethernet/intel/ice/ice_ethtool.c | 1 - drivers/net/ethernet/intel/ice/ice_main.c | 4 +--- drivers/net/ethernet/intel/ice/ice_txrx.h | 1 - drivers/net/ethernet/intel/ice/ice_txrx_lib.c | 2 -- net/core/dev.c | 2 ++ 6 files changed, 3 insertions(+), 8 deletions(-) -- 2.29.2 ^ permalink raw reply [flat|nested] 16+ messages in thread
* [Intel-wired-lan] [PATCH net-next v1 1/2] net: core: count drops from GRO 2021-01-06 21:55 [Intel-wired-lan] [PATCH net-next v1 0/2] GRO drop accounting Jesse Brandeburg @ 2021-01-06 21:55 ` Jesse Brandeburg 2021-01-07 18:47 ` Jacob Keller ` (2 more replies) 2021-01-06 21:55 ` [Intel-wired-lan] [PATCH net-next v1 2/2] ice: remove GRO drop accounting Jesse Brandeburg 1 sibling, 3 replies; 16+ messages in thread From: Jesse Brandeburg @ 2021-01-06 21:55 UTC (permalink / raw) To: intel-wired-lan When drivers call the various receive upcalls to receive an skb to the stack, sometimes that stack can drop the packet. The good news is that the return code is given to all the drivers of NET_RX_DROP or GRO_DROP. The bad news is that no drivers except the one "ice" driver that I changed, check the stat and increment the dropped count. This is currently leading to packets that arrive at the edge interface and are fully handled by the driver and then mysteriously disappear. Rather than fix all drivers to increment the drop stat when handling the return code, emulate the already existing statistic update for NET_RX_DROP events for the two GRO_DROP locations, and increment the dev->rx_dropped associated with the skb. Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Cc: Eric Dumazet <edumazet@google.com> Cc: Jamal Hadi Salim <jhs@mojatatu.com> --- net/core/dev.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/net/core/dev.c b/net/core/dev.c index 8fa739259041..ef34043a9550 100644 --- a/net/core/dev.c +++ b/net/core/dev.c @@ -6071,6 +6071,7 @@ static gro_result_t napi_skb_finish(struct napi_struct *napi, break; case GRO_DROP: + atomic_long_inc(&skb->dev->rx_dropped); kfree_skb(skb); break; @@ -6159,6 +6160,7 @@ static gro_result_t napi_frags_finish(struct napi_struct *napi, break; case GRO_DROP: + atomic_long_inc(&skb->dev->rx_dropped); napi_reuse_skb(napi, skb); break; -- 2.29.2 ^ permalink raw reply related [flat|nested] 16+ messages in thread
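For readers following along, the idea in the patch above can be sketched in plain userspace C: the code that sees the drop result bumps a per-device counter, so individual drivers do not each have to inspect the return code. This is only an illustration under stated assumptions. The `fake_*` types and functions are hypothetical stand-ins, not the kernel's definitions, and C11 `atomic_long` is used here in place of the kernel's `atomic_long_t`.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel types; NOT the real definitions. */
enum fake_gro_result { GRO_NORMAL, GRO_DROP };

struct fake_netdev {
	atomic_long rx_dropped;		/* models dev->rx_dropped */
};

struct fake_skb {
	struct fake_netdev *dev;	/* models skb->dev */
};

/*
 * Models the shape of the *_finish() helpers touched by the patch:
 * the drop is accounted centrally, next to the code that acts on the
 * GRO result, and the result is still passed back to the caller.
 */
static enum fake_gro_result fake_skb_finish(struct fake_skb *skb,
					    enum fake_gro_result ret)
{
	assert(skb != NULL && skb->dev != NULL);

	switch (ret) {
	case GRO_DROP:
		/* One atomic increment per dropped packet. */
		atomic_fetch_add(&skb->dev->rx_dropped, 1);
		break;
	case GRO_NORMAL:
		break;
	}
	return ret;
}

/* Read side: what a stats query would report for this device. */
static long fake_rx_dropped(struct fake_netdev *dev)
{
	return atomic_load(&dev->rx_dropped);
}
```

A caller would pass each result through `fake_skb_finish()` and later read the total with `fake_rx_dropped()`. Note that a single shared atomic is the simplest possible model; it says nothing about how such a counter behaves with many receive queues.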
* [Intel-wired-lan] [PATCH net-next v1 1/2] net: core: count drops from GRO 2021-01-06 21:55 ` [Intel-wired-lan] [PATCH net-next v1 1/2] net: core: count drops from GRO Jesse Brandeburg @ 2021-01-07 18:47 ` Jacob Keller 2021-01-07 21:15 ` Alexander Duyck 2021-01-08 18:23 ` Jesse Brandeburg 2021-01-08 0:50 ` Shannon Nelson 2021-01-08 9:25 ` Eric Dumazet 2 siblings, 2 replies; 16+ messages in thread From: Jacob Keller @ 2021-01-07 18:47 UTC (permalink / raw) To: intel-wired-lan On 1/6/2021 1:55 PM, Jesse Brandeburg wrote: > When drivers call the various receive upcalls to receive an skb > to the stack, sometimes that stack can drop the packet. The good > news is that the return code is given to all the drivers of > NET_RX_DROP or GRO_DROP. The bad news is that no drivers except > the one "ice" driver that I changed, check the stat and increment > the dropped count. This is currently leading to packets that > arrive at the edge interface and are fully handled by the driver > and then mysteriously disappear. > > Rather than fix all drivers to increment the drop stat when > handling the return code, emulate the already existing statistic > update for NET_RX_DROP events for the two GRO_DROP locations, and > increment the dev->rx_dropped associated with the skb. > > Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> > Cc: Eric Dumazet <edumazet@google.com> > Cc: Jamal Hadi Salim <jhs@mojatatu.com> > --- > net/core/dev.c | 2 ++ > 1 file changed, 2 insertions(+) > > diff --git a/net/core/dev.c b/net/core/dev.c > index 8fa739259041..ef34043a9550 100644 > --- a/net/core/dev.c > +++ b/net/core/dev.c > @@ -6071,6 +6071,7 @@ static gro_result_t napi_skb_finish(struct napi_struct *napi, > break; > > case GRO_DROP: > + atomic_long_inc(&skb->dev->rx_dropped); > kfree_skb(skb); > break; Would it make sense to have this be a different stat? Or is it really basically the same as the existing rx_dropped, so that treating it differently wouldn't make much sense? 
> > @@ -6159,6 +6160,7 @@ static gro_result_t napi_frags_finish(struct napi_struct *napi, > break; > > case GRO_DROP: > + atomic_long_inc(&skb->dev->rx_dropped); > napi_reuse_skb(napi, skb); > break; > > ^ permalink raw reply [flat|nested] 16+ messages in thread
* [Intel-wired-lan] [PATCH net-next v1 1/2] net: core: count drops from GRO 2021-01-07 18:47 ` Jacob Keller @ 2021-01-07 21:15 ` Alexander Duyck 2021-01-08 18:23 ` Jesse Brandeburg 1 sibling, 0 replies; 16+ messages in thread From: Alexander Duyck @ 2021-01-07 21:15 UTC (permalink / raw) To: intel-wired-lan On Thu, Jan 7, 2021 at 10:47 AM Jacob Keller <jacob.e.keller@intel.com> wrote: > > > > On 1/6/2021 1:55 PM, Jesse Brandeburg wrote: > > When drivers call the various receive upcalls to receive an skb > > to the stack, sometimes that stack can drop the packet. The good > > news is that the return code is given to all the drivers of > > NET_RX_DROP or GRO_DROP. The bad news is that no drivers except > > the one "ice" driver that I changed, check the stat and increment > > the dropped count. This is currently leading to packets that > > arrive at the edge interface and are fully handled by the driver > > and then mysteriously disappear. > > > > Rather than fix all drivers to increment the drop stat when > > handling the return code, emulate the already existing statistic > > update for NET_RX_DROP events for the two GRO_DROP locations, and > > increment the dev->rx_dropped associated with the skb. > > > > Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> > > Cc: Eric Dumazet <edumazet@google.com> > > Cc: Jamal Hadi Salim <jhs@mojatatu.com> > > --- > > net/core/dev.c | 2 ++ > > 1 file changed, 2 insertions(+) > > > > diff --git a/net/core/dev.c b/net/core/dev.c > > index 8fa739259041..ef34043a9550 100644 > > --- a/net/core/dev.c > > +++ b/net/core/dev.c > > @@ -6071,6 +6071,7 @@ static gro_result_t napi_skb_finish(struct napi_struct *napi, > > break; > > > > case GRO_DROP: > > + atomic_long_inc(&skb->dev->rx_dropped); > > kfree_skb(skb); > > break; > > Would it make sense to have this be a different stat? Or is it really > basically the same as the existing rx_dropped, so that treating it > differently wouldn't make much sense? 
I'm not seeing why this is anything that we really need to track. From what I can tell GRO_DROP is only returned in one case, and that is if we are using napi_gro_frags and napi_frags_skb returns NULL. I cannot see how you can actually return GRO_DROP to the two functions in question as it looks like dev_gro_receive. Are these paths perhaps dead code? Also it doesn't make much sense to free the skb in the GRO_DROP case as it looks like the skb has already been recycled. It might make more sense to add the counter in napi_frags_skb in the case where we are going to return NULL and reset the NAPI skb, and maybe look at dropping these code paths since I don't think it is possible for us to get here. ^ permalink raw reply [flat|nested] 16+ messages in thread
* [Intel-wired-lan] [PATCH net-next v1 1/2] net: core: count drops from GRO 2021-01-07 18:47 ` Jacob Keller 2021-01-07 21:15 ` Alexander Duyck @ 2021-01-08 18:23 ` Jesse Brandeburg 1 sibling, 0 replies; 16+ messages in thread From: Jesse Brandeburg @ 2021-01-08 18:23 UTC (permalink / raw) To: intel-wired-lan Jacob Keller wrote: > > case GRO_DROP: > > + atomic_long_inc(&skb->dev->rx_dropped); > > kfree_skb(skb); > > break; > > Would it make sense to have this be a different stat? Or is it really > basically the same as the existing rx_dropped, so that treating it > differently wouldn't make much sense? Not sure, I was hoping to get feedback here. More later in the thread... ^ permalink raw reply [flat|nested] 16+ messages in thread
* [Intel-wired-lan] [PATCH net-next v1 1/2] net: core: count drops from GRO 2021-01-06 21:55 ` [Intel-wired-lan] [PATCH net-next v1 1/2] net: core: count drops from GRO Jesse Brandeburg 2021-01-07 18:47 ` Jacob Keller @ 2021-01-08 0:50 ` Shannon Nelson 2021-01-08 18:26 ` Jesse Brandeburg 2021-01-08 9:25 ` Eric Dumazet 2 siblings, 1 reply; 16+ messages in thread From: Shannon Nelson @ 2021-01-08 0:50 UTC (permalink / raw) To: intel-wired-lan On 1/6/21 1:55 PM, Jesse Brandeburg wrote: > When drivers call the various receive upcalls to receive an skb > to the stack, sometimes that stack can drop the packet. The good > news is that the return code is given to all the drivers of > NET_RX_DROP or GRO_DROP. The bad news is that no drivers except > the one "ice" driver that I changed, check the stat and increment If the stack is dropping the packet, isn't it up to the stack to track that, perhaps with something that shows up in netstat -s? We don't really want to make the driver responsible for any drops that happen above its head, do we? sln > the dropped count. This is currently leading to packets that > arrive at the edge interface and are fully handled by the driver > and then mysteriously disappear. > > Rather than fix all drivers to increment the drop stat when > handling the return code, emulate the already existing statistic > update for NET_RX_DROP events for the two GRO_DROP locations, and > increment the dev->rx_dropped associated with the skb. 
> > Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> > Cc: Eric Dumazet <edumazet@google.com> > Cc: Jamal Hadi Salim <jhs@mojatatu.com> > --- > net/core/dev.c | 2 ++ > 1 file changed, 2 insertions(+) > > diff --git a/net/core/dev.c b/net/core/dev.c > index 8fa739259041..ef34043a9550 100644 > --- a/net/core/dev.c > +++ b/net/core/dev.c > @@ -6071,6 +6071,7 @@ static gro_result_t napi_skb_finish(struct napi_struct *napi, > break; > > case GRO_DROP: > + atomic_long_inc(&skb->dev->rx_dropped); > kfree_skb(skb); > break; > > @@ -6159,6 +6160,7 @@ static gro_result_t napi_frags_finish(struct napi_struct *napi, > break; > > case GRO_DROP: > + atomic_long_inc(&skb->dev->rx_dropped); > napi_reuse_skb(napi, skb); > break; > ^ permalink raw reply [flat|nested] 16+ messages in thread
* [Intel-wired-lan] [PATCH net-next v1 1/2] net: core: count drops from GRO 2021-01-08 0:50 ` Shannon Nelson @ 2021-01-08 18:26 ` Jesse Brandeburg 2021-01-08 19:21 ` Shannon Nelson 0 siblings, 1 reply; 16+ messages in thread From: Jesse Brandeburg @ 2021-01-08 18:26 UTC (permalink / raw) To: intel-wired-lan Shannon Nelson wrote: > On 1/6/21 1:55 PM, Jesse Brandeburg wrote: > > When drivers call the various receive upcalls to receive an skb > > to the stack, sometimes that stack can drop the packet. The good > > news is that the return code is given to all the drivers of > > NET_RX_DROP or GRO_DROP. The bad news is that no drivers except > > the one "ice" driver that I changed, check the stat and increment > > If the stack is dropping the packet, isn't it up to the stack to track > that, perhaps with something that shows up in netstat -s? We don't > really want to make the driver responsible for any drops that happen > above its head, do we? I totally agree! In patch 2/2 I revert the driver-specific changes I had made in an earlier patch, and this patch *was* my effort to make the stack show the drops. Maybe I wasn't clear. I'm seeing packets disappear during TCP workloads, and this GRO_DROP code was the source of the drops (I see it returning infrequently but regularly) The driver processes the packet but the stack never sees it, and there were no drop counters anywhere tracking it. ^ permalink raw reply [flat|nested] 16+ messages in thread
* [Intel-wired-lan] [PATCH net-next v1 1/2] net: core: count drops from GRO 2021-01-08 18:26 ` Jesse Brandeburg @ 2021-01-08 19:21 ` Shannon Nelson 2021-01-08 20:26 ` Saeed Mahameed 2021-01-14 13:53 ` Jamal Hadi Salim 0 siblings, 2 replies; 16+ messages in thread From: Shannon Nelson @ 2021-01-08 19:21 UTC (permalink / raw) To: intel-wired-lan On 1/8/21 10:26 AM, Jesse Brandeburg wrote: > Shannon Nelson wrote: > >> On 1/6/21 1:55 PM, Jesse Brandeburg wrote: >>> When drivers call the various receive upcalls to receive an skb >>> to the stack, sometimes that stack can drop the packet. The good >>> news is that the return code is given to all the drivers of >>> NET_RX_DROP or GRO_DROP. The bad news is that no drivers except >>> the one "ice" driver that I changed, check the stat and increment >> If the stack is dropping the packet, isn't it up to the stack to track >> that, perhaps with something that shows up in netstat -s? We don't >> really want to make the driver responsible for any drops that happen >> above its head, do we? > I totally agree! > > In patch 2/2 I revert the driver-specific changes I had made in an > earlier patch, and this patch *was* my effort to make the stack show the > drops. > > Maybe I wasn't clear. I'm seeing packets disappear during TCP > workloads, and this GRO_DROP code was the source of the drops (I see it > returning infrequently but regularly) > > The driver processes the packet but the stack never sees it, and there > were no drop counters anywhere tracking it. > My point is that the patch increments a netdev counter, which to my mind immediately implicates the driver and hardware, rather than the stack. As a driver maintainer, I don't want to be chasing driver packet drop reports that are a stack problem. I'd rather see a new counter in netstat -s that reflects the stack decision and can better imply what went wrong. I don't have a good suggestion for a counter name at the moment. 
I guess part of the issue is that this is right on the boundary of driver-stack. But if we follow Eric's suggestions, maybe the problem magically goes away :-) . sln ^ permalink raw reply [flat|nested] 16+ messages in thread
* [Intel-wired-lan] [PATCH net-next v1 1/2] net: core: count drops from GRO 2021-01-08 19:21 ` Shannon Nelson @ 2021-01-08 20:26 ` Saeed Mahameed 2021-01-08 22:17 ` Eric Dumazet 2021-01-14 13:53 ` Jamal Hadi Salim 1 sibling, 1 reply; 16+ messages in thread From: Saeed Mahameed @ 2021-01-08 20:26 UTC (permalink / raw) To: intel-wired-lan On Fri, 2021-01-08 at 11:21 -0800, Shannon Nelson wrote: > On 1/8/21 10:26 AM, Jesse Brandeburg wrote: > > Shannon Nelson wrote: > > > > > On 1/6/21 1:55 PM, Jesse Brandeburg wrote: > > > > When drivers call the various receive upcalls to receive an skb > > > > to the stack, sometimes that stack can drop the packet. The > > > > good > > > > news is that the return code is given to all the drivers of > > > > NET_RX_DROP or GRO_DROP. The bad news is that no drivers except > > > > the one "ice" driver that I changed, check the stat and > > > > increment > > > If the stack is dropping the packet, isn't it up to the stack to > > > track > > > that, perhaps with something that shows up in netstat -s? We > > > don't > > > really want to make the driver responsible for any drops that > > > happen > > > above its head, do we? > > I totally agree! > > > > In patch 2/2 I revert the driver-specific changes I had made in an > > earlier patch, and this patch *was* my effort to make the stack > > show the > > drops. > > > > Maybe I wasn't clear. I'm seeing packets disappear during TCP > > workloads, and this GRO_DROP code was the source of the drops (I > > see it > > returning infrequently but regularly) > > > > The driver processes the packet but the stack never sees it, and > > there > > were no drop counters anywhere tracking it. > > > > My point is that the patch increments a netdev counter, which to my > mind > immediately implicates the driver and hardware, rather than the > stack. > As a driver maintainer, I don't want to be chasing driver packet > drop > reports that are a stack problem. 
I'd rather see a new counter in > netstat -s that reflects the stack decision and can better imply > what > went wrong. I don't have a good suggestion for a counter name at > the > moment. > > I guess part of the issue is that this is right on the boundary of > driver-stack. But if we follow Eric's suggestions, maybe the > problem > magically goes away :-) . > > sln > I think there is still some merit in this patchset even with Eric's removal of GRO_DROP from gro_receive(). As Eric explained, it is still possible to silently drop packets for the same reason when drivers call napi_get_frags() or even the alloc_skb() APIs. Many drivers do not account for such packet drops, so maybe the right thing to do is to inline the packet drop accounting into the skb alloc APIs? The question is: is it the job of those APIs to update netdev->stats? ^ permalink raw reply [flat|nested] 16+ messages in thread
* [Intel-wired-lan] [PATCH net-next v1 1/2] net: core: count drops from GRO 2021-01-08 20:26 ` Saeed Mahameed @ 2021-01-08 22:17 ` Eric Dumazet 0 siblings, 0 replies; 16+ messages in thread From: Eric Dumazet @ 2021-01-08 22:17 UTC (permalink / raw) To: intel-wired-lan On Fri, Jan 8, 2021 at 9:27 PM Saeed Mahameed <saeed@kernel.org> wrote: > > I think there is still some merit in this patchset even with Eric's > removal of GRO_DROP from gro_receive(). As Eric explained, it is still > possible to silently drop for the same reason when drivers > call napi_get_frags or even alloc_skb() apis, many drivers do not > account for such packet drops, and maybe it is the right thing to do to > inline the packet drop accounting into the skb alloc APIs ? the > question is, is it the job of those APIs to update netdev->stats ? > You absolutely do not want to have a generic increment of netdev->stats for multiqueue drivers. This would add terrible cache line false sharing under DDOS and memory stress. Each driver maintains (or should maintain) per rx queue counter for this case. It seems mlx4 does nothing special, I would suggest you fix it :) ^ permalink raw reply [flat|nested] 16+ messages in thread
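The per-queue approach Eric describes above can be sketched in userspace C as follows. This is a minimal illustration, not driver code: each receive queue owns a counter on its own cache line, the hot path touches only that counter, and a device-wide total is computed only when stats are read. The `fake_*` names, the 64-byte cache-line size, and the queue count of 4 are all assumptions made for the example, and the sketch ignores the synchronization a real driver needs when a reader runs concurrently with the receive path.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define FAKE_CACHE_LINE	64	/* assumed cache-line size */
#define FAKE_NUM_QUEUES	4	/* hypothetical number of RX queues */

/*
 * One drop counter per RX queue. The alignment pads each element to a
 * full cache line, so two queues never write to the same line.
 */
struct fake_rxq_stats {
	_Alignas(FAKE_CACHE_LINE) uint64_t alloc_failed;
};

struct fake_dev {
	struct fake_rxq_stats rxq[FAKE_NUM_QUEUES];
};

/* Hot path: only the context that owns queue q writes this counter. */
static void fake_rxq_count_drop(struct fake_dev *dev, size_t q)
{
	assert(q < FAKE_NUM_QUEUES);
	dev->rxq[q].alloc_failed++;
}

/* Slow path: sum the per-queue counters when stats are requested. */
static uint64_t fake_dev_rx_dropped(const struct fake_dev *dev)
{
	uint64_t sum = 0;

	for (size_t q = 0; q < FAKE_NUM_QUEUES; q++)
		sum += dev->rxq[q].alloc_failed;
	return sum;
}
```

The trade-off is that reads become a loop over all queues, which is acceptable because stats are read far less often than packets are received.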
* [Intel-wired-lan] [PATCH net-next v1 1/2] net: core: count drops from GRO 2021-01-08 19:21 ` Shannon Nelson 2021-01-08 20:26 ` Saeed Mahameed @ 2021-01-14 13:53 ` Jamal Hadi Salim 1 sibling, 0 replies; 16+ messages in thread From: Jamal Hadi Salim @ 2021-01-14 13:53 UTC (permalink / raw) To: intel-wired-lan On 2021-01-08 2:21 p.m., Shannon Nelson wrote: > On 1/8/21 10:26 AM, Jesse Brandeburg wrote: >> Shannon Nelson wrote: >> >>> On 1/6/21 1:55 PM, Jesse Brandeburg wrote: >>>> When drivers call the various receive upcalls to receive an skb >>>> to the stack, sometimes that stack can drop the packet. The good >>>> news is that the return code is given to all the drivers of >>>> NET_RX_DROP or GRO_DROP. The bad news is that no drivers except >>>> the one "ice" driver that I changed, check the stat and increment >>> If the stack is dropping the packet, isn't it up to the stack to track >>> that, perhaps with something that shows up in netstat -s? We don't >>> really want to make the driver responsible for any drops that happen >>> above its head, do we? >> I totally agree! >> >> In patch 2/2 I revert the driver-specific changes I had made in an >> earlier patch, and this patch *was* my effort to make the stack show the >> drops. >> >> Maybe I wasn't clear. I'm seeing packets disappear during TCP >> workloads, and this GRO_DROP code was the source of the drops (I see it >> returning infrequently but regularly) >> >> The driver processes the packet but the stack never sees it, and there >> were no drop counters anywhere tracking it. >> > > My point is that the patch increments a netdev counter, which to my mind > immediately implicates the driver and hardware, rather than the stack. > As a driver maintainer, I don't want to be chasing driver packet drop > reports that are a stack problem. I'd rather see a new counter in > netstat -s that reflects the stack decision and can better imply what > went wrong. 
I don't have a good suggestion for a counter name at the > moment. > > I guess part of the issue is that this is right on the boundary of > driver-stack. But if we follow Eric's suggestions, maybe the problem > magically goes away :-) . > So: How does one know that the stack-upcall dropped a packet because of GRO issues? Debugging with kprobe or traces doesn't count as an answer. cheers, jamal ^ permalink raw reply [flat|nested] 16+ messages in thread
* [Intel-wired-lan] [PATCH net-next v1 1/2] net: core: count drops from GRO 2021-01-06 21:55 ` [Intel-wired-lan] [PATCH net-next v1 1/2] net: core: count drops from GRO Jesse Brandeburg 2021-01-07 18:47 ` Jacob Keller 2021-01-08 0:50 ` Shannon Nelson @ 2021-01-08 9:25 ` Eric Dumazet 2021-01-08 18:35 ` Jesse Brandeburg 2 siblings, 1 reply; 16+ messages in thread From: Eric Dumazet @ 2021-01-08 9:25 UTC (permalink / raw) To: intel-wired-lan On Wed, Jan 6, 2021 at 10:56 PM Jesse Brandeburg <jesse.brandeburg@intel.com> wrote: > > When drivers call the various receive upcalls to receive an skb > to the stack, sometimes that stack can drop the packet. The good > news is that the return code is given to all the drivers of > NET_RX_DROP or GRO_DROP. The bad news is that no drivers except > the one "ice" driver that I changed, check the stat and increment > the dropped count. This is currently leading to packets that > arrive at the edge interface and are fully handled by the driver > and then mysteriously disappear. > > Rather than fix all drivers to increment the drop stat when > handling the return code, emulate the already existing statistic > update for NET_RX_DROP events for the two GRO_DROP locations, and > increment the dev->rx_dropped associated with the skb. 
> > Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> > Cc: Eric Dumazet <edumazet@google.com> > Cc: Jamal Hadi Salim <jhs@mojatatu.com> > --- > net/core/dev.c | 2 ++ > 1 file changed, 2 insertions(+) > > diff --git a/net/core/dev.c b/net/core/dev.c > index 8fa739259041..ef34043a9550 100644 > --- a/net/core/dev.c > +++ b/net/core/dev.c > @@ -6071,6 +6071,7 @@ static gro_result_t napi_skb_finish(struct napi_struct *napi, > break; > > case GRO_DROP: > + atomic_long_inc(&skb->dev->rx_dropped); > kfree_skb(skb); > break; > > @@ -6159,6 +6160,7 @@ static gro_result_t napi_frags_finish(struct napi_struct *napi, > break; > > case GRO_DROP: > + atomic_long_inc(&skb->dev->rx_dropped); > napi_reuse_skb(napi, skb); > break; > This is not needed. I think we should clean up ice instead. Drivers are supposed to have allocated the skb (using napi_get_frags()) before calling napi_gro_frags() Only napi_gro_frags() would return GRO_DROP, but we supposedly could crash at that point, since a driver is clearly buggy. We probably can remove GRO_DROP completely, assuming lazy drivers are fixed. diff --git a/net/core/dev.c b/net/core/dev.c index 8fa739259041aaa03585b5a7b8ebce862f4b7d1d..c9460c9597f1de51957fdcfc7a64ca45bce5af7c 100644 --- a/net/core/dev.c +++ b/net/core/dev.c @@ -6223,9 +6223,6 @@ gro_result_t napi_gro_frags(struct napi_struct *napi) gro_result_t ret; struct sk_buff *skb = napi_frags_skb(napi); - if (!skb) - return GRO_DROP; - trace_napi_gro_frags_entry(skb); ret = napi_frags_finish(napi, skb, dev_gro_receive(napi, skb)); ^ permalink raw reply [flat|nested] 16+ messages in thread
* [Intel-wired-lan] [PATCH net-next v1 1/2] net: core: count drops from GRO 2021-01-08 9:25 ` Eric Dumazet @ 2021-01-08 18:35 ` Jesse Brandeburg 2021-01-08 18:45 ` Eric Dumazet 0 siblings, 1 reply; 16+ messages in thread From: Jesse Brandeburg @ 2021-01-08 18:35 UTC (permalink / raw) To: intel-wired-lan Eric Dumazet wrote: > > --- a/net/core/dev.c > > +++ b/net/core/dev.c > > @@ -6071,6 +6071,7 @@ static gro_result_t napi_skb_finish(struct napi_struct *napi, > > break; > > > > case GRO_DROP: > > + atomic_long_inc(&skb->dev->rx_dropped); > > kfree_skb(skb); > > break; > > > > @@ -6159,6 +6160,7 @@ static gro_result_t napi_frags_finish(struct napi_struct *napi, > > break; > > > > case GRO_DROP: > > + atomic_long_inc(&skb->dev->rx_dropped); > > napi_reuse_skb(napi, skb); > > break; > > > > > This is not needed. I think we should clean up ice instead. My patch 2 already did that. I was trying to address the fact that I'm *actually seeing* GRO_DROP return codes coming back from stack. I'll try to reproduce that issue again that I saw. Maybe modern kernels don't have the problem as frequently or at all. > Drivers are supposed to have allocated the skb (using > napi_get_frags()) before calling napi_gro_frags() ice doesn't use napi_get_frags/napi_gro_frags, so I'm not sure how this is relevant. > Only napi_gro_frags() would return GRO_DROP, but we supposedly could > crash at that point, since a driver is clearly buggy. seems unlikely since we don't call those functions. > We probably can remove GRO_DROP completely, assuming lazy drivers are fixed. This might be ok, but doesn't explain why I was seeing this return code (which was the whole reason I was trying to count them), however I may have been running on a distro kernel from redhat/centos 8 when I was seeing these events. I haven't fully completed spelunking all the different sources, but might be able to follow down the rabbit hole further. 
> diff --git a/net/core/dev.c b/net/core/dev.c > index 8fa739259041aaa03585b5a7b8ebce862f4b7d1d..c9460c9597f1de51957fdcfc7a64ca45bce5af7c > 100644 > --- a/net/core/dev.c > +++ b/net/core/dev.c > @@ -6223,9 +6223,6 @@ gro_result_t napi_gro_frags(struct napi_struct *napi) > gro_result_t ret; > struct sk_buff *skb = napi_frags_skb(napi); > > - if (!skb) > - return GRO_DROP; > - > trace_napi_gro_frags_entry(skb); > > ret = napi_frags_finish(napi, skb, dev_gro_receive(napi, skb)); This change (noted from your other patches) is fine, and a likely improvement; thanks for sending those! ^ permalink raw reply [flat|nested] 16+ messages in thread
* [Intel-wired-lan] [PATCH net-next v1 1/2] net: core: count drops from GRO 2021-01-08 18:35 ` Jesse Brandeburg @ 2021-01-08 18:45 ` Eric Dumazet 2021-01-09 0:54 ` Jacob Keller 0 siblings, 1 reply; 16+ messages in thread From: Eric Dumazet @ 2021-01-08 18:45 UTC (permalink / raw) To: intel-wired-lan On Fri, Jan 8, 2021 at 7:35 PM Jesse Brandeburg <jesse.brandeburg@intel.com> wrote: > > Eric Dumazet wrote: > > > --- a/net/core/dev.c > > > +++ b/net/core/dev.c > > > @@ -6071,6 +6071,7 @@ static gro_result_t napi_skb_finish(struct napi_struct *napi, > > > break; > > > > > > case GRO_DROP: > > > + atomic_long_inc(&skb->dev->rx_dropped); > > > kfree_skb(skb); > > > break; > > > > > > @@ -6159,6 +6160,7 @@ static gro_result_t napi_frags_finish(struct napi_struct *napi, > > > break; > > > > > > case GRO_DROP: > > > + atomic_long_inc(&skb->dev->rx_dropped); > > > napi_reuse_skb(napi, skb); > > > break; > > > > > > > > > This is not needed. I think we should clean up ice instead. > > My patch 2 already did that. I was trying to address the fact that I'm > *actually seeing* GRO_DROP return codes coming back from stack. > > I'll try to reproduce that issue again that I saw. Maybe modern kernels > don't have the problem as frequently or at all. Jesse, you are sending a patch for current kernels. It is pretty clear that the issue you have can not happen with current kernels, by reading the code source, even without an actual ICE piece of hardware to test this :) > > > Drivers are supposed to have allocated the skb (using > > napi_get_frags()) before calling napi_gro_frags() > > ice doesn't use napi_get_frags/napi_gro_frags, so I'm not sure how this > is relevant. > > > Only napi_gro_frags() would return GRO_DROP, but we supposedly could > > crash at that point, since a driver is clearly buggy. > > seems unlikely since we don't call those functions. > > > We probably can remove GRO_DROP completely, assuming lazy drivers are fixed. 
> > This might be ok, but doesn't explain why I was seeing this return > code (which was the whole reason I was trying to count them), however I > may have been running on a distro kernel from redhat/centos 8 when I > was seeing these events. I haven't fully completed spelunking all the > different sources, but might be able to follow down the rabbit hole > further. Yes please :) > > > > diff --git a/net/core/dev.c b/net/core/dev.c > > index 8fa739259041aaa03585b5a7b8ebce862f4b7d1d..c9460c9597f1de51957fdcfc7a64ca45bce5af7c > > 100644 > > --- a/net/core/dev.c > > +++ b/net/core/dev.c > > @@ -6223,9 +6223,6 @@ gro_result_t napi_gro_frags(struct napi_struct *napi) > > gro_result_t ret; > > struct sk_buff *skb = napi_frags_skb(napi); > > > > - if (!skb) > > - return GRO_DROP; > > - > > trace_napi_gro_frags_entry(skb); > > > > ret = napi_frags_finish(napi, skb, dev_gro_receive(napi, skb)); > > This change (noted from your other patches is fine), and a likely > improvement, thanks for sending those! Sure ! ^ permalink raw reply [flat|nested] 16+ messages in thread
* [Intel-wired-lan] [PATCH net-next v1 1/2] net: core: count drops from GRO 2021-01-08 18:45 ` Eric Dumazet @ 2021-01-09 0:54 ` Jacob Keller 0 siblings, 0 replies; 16+ messages in thread From: Jacob Keller @ 2021-01-09 0:54 UTC (permalink / raw) To: intel-wired-lan On 1/8/2021 10:45 AM, Eric Dumazet wrote: > On Fri, Jan 8, 2021 at 7:35 PM Jesse Brandeburg > <jesse.brandeburg@intel.com> wrote: >> >> Eric Dumazet wrote: >>>> --- a/net/core/dev.c >>>> +++ b/net/core/dev.c >>>> @@ -6071,6 +6071,7 @@ static gro_result_t napi_skb_finish(struct napi_struct *napi, >>>> break; >>>> >>>> case GRO_DROP: >>>> + atomic_long_inc(&skb->dev->rx_dropped); >>>> kfree_skb(skb); >>>> break; >>>> >>>> @@ -6159,6 +6160,7 @@ static gro_result_t napi_frags_finish(struct napi_struct *napi, >>>> break; >>>> >>>> case GRO_DROP: >>>> + atomic_long_inc(&skb->dev->rx_dropped); >>>> napi_reuse_skb(napi, skb); >>>> break; >>>> >>> >>> >>> This is not needed. I think we should clean up ice instead. >> >> My patch 2 already did that. I was trying to address the fact that I'm >> *actually seeing* GRO_DROP return codes coming back from stack. >> >> I'll try to reproduce that issue again that I saw. Maybe modern kernels >> don't have the problem as frequently or at all. > > > Jesse, you are sending a patch for current kernels. > > It is pretty clear that the issue you have can not happen with current > kernels, by reading the code source, > even without an actual ICE piece of hardware to test this :) > FWIW, I did some digging through the history to see what might have removed other possible GRO_DROP returns. I found this commit: 6570bc79c0df ("net: core: use listified Rx for GRO_NORMAL in napi_gro_receive()") It seems to have modified napi_skb_finish in such a way that it no longer reports GRO_DROP. I had trouble finding the other cases where GRO_DROP was removed, but I also am in favor of just removing it entirely at this point. ^ permalink raw reply [flat|nested] 16+ messages in thread
* [Intel-wired-lan] [PATCH net-next v1 2/2] ice: remove GRO drop accounting
  2021-01-06 21:55 [Intel-wired-lan] [PATCH net-next v1 0/2] GRO drop accounting Jesse Brandeburg
  2021-01-06 21:55 ` [Intel-wired-lan] [PATCH net-next v1 1/2] net: core: count drops from GRO Jesse Brandeburg
@ 2021-01-06 21:55 ` Jesse Brandeburg
  1 sibling, 0 replies; 16+ messages in thread
From: Jesse Brandeburg @ 2021-01-06 21:55 UTC (permalink / raw)
  To: intel-wired-lan

The driver was counting GRO drops but now that the stack does it with
the previous patch, the driver can drop all the logic. The driver keeps
the dev_dbg message in order to do optional detailed tracing.

This mostly undoes commit a8fffd7ae9a5 ("ice: add useful statistics").

Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
---
 drivers/net/ethernet/intel/ice/ice.h          | 1 -
 drivers/net/ethernet/intel/ice/ice_ethtool.c  | 1 -
 drivers/net/ethernet/intel/ice/ice_main.c     | 4 +---
 drivers/net/ethernet/intel/ice/ice_txrx.h     | 1 -
 drivers/net/ethernet/intel/ice/ice_txrx_lib.c | 2 --
 5 files changed, 1 insertion(+), 8 deletions(-)

diff --git a/drivers/net/ethernet/intel/ice/ice.h b/drivers/net/ethernet/intel/ice/ice.h
index 56725356a17b..dde850045e7e 100644
--- a/drivers/net/ethernet/intel/ice/ice.h
+++ b/drivers/net/ethernet/intel/ice/ice.h
@@ -256,7 +256,6 @@ struct ice_vsi {
 	u32 tx_busy;
 	u32 rx_buf_failed;
 	u32 rx_page_failed;
-	u32 rx_gro_dropped;
 	u16 num_q_vectors;
 	u16 base_vector;		/* IRQ base for OS reserved vectors */
 	enum ice_vsi_type type;
diff --git a/drivers/net/ethernet/intel/ice/ice_ethtool.c b/drivers/net/ethernet/intel/ice/ice_ethtool.c
index 9e8e9531cd87..025c0a13e724 100644
--- a/drivers/net/ethernet/intel/ice/ice_ethtool.c
+++ b/drivers/net/ethernet/intel/ice/ice_ethtool.c
@@ -59,7 +59,6 @@ static const struct ice_stats ice_gstrings_vsi_stats[] = {
 	ICE_VSI_STAT("rx_unknown_protocol", eth_stats.rx_unknown_protocol),
 	ICE_VSI_STAT("rx_alloc_fail", rx_buf_failed),
 	ICE_VSI_STAT("rx_pg_alloc_fail", rx_page_failed),
-	ICE_VSI_STAT("rx_gro_dropped", rx_gro_dropped),
 	ICE_VSI_STAT("tx_errors", eth_stats.tx_errors),
 	ICE_VSI_STAT("tx_linearize", tx_linearize),
 	ICE_VSI_STAT("tx_busy", tx_busy),
diff --git a/drivers/net/ethernet/intel/ice/ice_main.c b/drivers/net/ethernet/intel/ice/ice_main.c
index c52b9bb0e3ab..e157a2b4fcb9 100644
--- a/drivers/net/ethernet/intel/ice/ice_main.c
+++ b/drivers/net/ethernet/intel/ice/ice_main.c
@@ -5314,7 +5314,6 @@ static void ice_update_vsi_ring_stats(struct ice_vsi *vsi)
 	vsi->tx_linearize = 0;
 	vsi->rx_buf_failed = 0;
 	vsi->rx_page_failed = 0;
-	vsi->rx_gro_dropped = 0;

 	rcu_read_lock();

@@ -5329,7 +5328,6 @@ static void ice_update_vsi_ring_stats(struct ice_vsi *vsi)
 		vsi_stats->rx_bytes += bytes;
 		vsi->rx_buf_failed += ring->rx_stats.alloc_buf_failed;
 		vsi->rx_page_failed += ring->rx_stats.alloc_page_failed;
-		vsi->rx_gro_dropped += ring->rx_stats.gro_dropped;
 	}

 	/* update XDP Tx rings counters */
@@ -5361,7 +5359,7 @@ void ice_update_vsi_stats(struct ice_vsi *vsi)
 	ice_update_eth_stats(vsi);

 	cur_ns->tx_errors = cur_es->tx_errors;
-	cur_ns->rx_dropped = cur_es->rx_discards + vsi->rx_gro_dropped;
+	cur_ns->rx_dropped = cur_es->rx_discards;
 	cur_ns->tx_dropped = cur_es->tx_discards;
 	cur_ns->multicast = cur_es->rx_multicast;

diff --git a/drivers/net/ethernet/intel/ice/ice_txrx.h b/drivers/net/ethernet/intel/ice/ice_txrx.h
index ff1a1cbd078e..6ce2046fc349 100644
--- a/drivers/net/ethernet/intel/ice/ice_txrx.h
+++ b/drivers/net/ethernet/intel/ice/ice_txrx.h
@@ -193,7 +193,6 @@ struct ice_rxq_stats {
 	u64 non_eop_descs;
 	u64 alloc_page_failed;
 	u64 alloc_buf_failed;
-	u64 gro_dropped; /* GRO returned dropped */
 };

 /* this enum matches hardware bits and is meant to be used by DYN_CTLN
diff --git a/drivers/net/ethernet/intel/ice/ice_txrx_lib.c b/drivers/net/ethernet/intel/ice/ice_txrx_lib.c
index bc2f4390b51d..3601b7d8abe5 100644
--- a/drivers/net/ethernet/intel/ice/ice_txrx_lib.c
+++ b/drivers/net/ethernet/intel/ice/ice_txrx_lib.c
@@ -192,8 +192,6 @@ ice_receive_skb(struct ice_ring *rx_ring, struct sk_buff *skb, u16 vlan_tag)
 	    (vlan_tag & VLAN_VID_MASK))
 		__vlan_hwaccel_put_tag(skb, htons(ETH_P_8021Q), vlan_tag);
 	if (napi_gro_receive(&rx_ring->q_vector->napi, skb) == GRO_DROP) {
-		/* this is tracked separately to help us debug stack drops */
-		rx_ring->rx_stats.gro_dropped++;
 		netdev_dbg(rx_ring->netdev, "Receive Queue %d: Dropped packet from GRO\n",
 			   rx_ring->q_index);
 	}
-- 
2.29.2

^ permalink raw reply related	[flat|nested] 16+ messages in thread
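Editorial sketch (not part of the posted series): the split of responsibilities that the two patches propose can be modelled in plain user-space C. The stack-side finish step bumps a per-device drop counter once, and the driver keeps only an optional debug message. Every name below (`toy_netdev`, `toy_skb`, `toy_skb_finish`, `toy_driver_receive`) is a hypothetical stand-in, and C11 `atomic_fetch_add` is used in place of the kernel's `atomic_long_inc`. This is an illustration under those assumptions, not kernel code, and it says nothing about whether `GRO_DROP` can still be returned by current kernels, which is the point the reviewers dispute in this thread.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdio.h>

/* Hypothetical stand-ins for the kernel types; not the real definitions. */
enum toy_gro_result { TOY_GRO_NORMAL, TOY_GRO_DROP };

struct toy_netdev {
	const char *name;
	atomic_long rx_dropped;		/* models dev->rx_dropped */
};

struct toy_skb {
	struct toy_netdev *dev;
};

/*
 * Models the stack-side finish step proposed in patch 1/2: the drop is
 * counted once, centrally, against the device attached to the skb.
 */
static enum toy_gro_result toy_skb_finish(struct toy_skb *skb,
					  enum toy_gro_result ret)
{
	assert(skb && skb->dev);
	if (ret == TOY_GRO_DROP)
		atomic_fetch_add(&skb->dev->rx_dropped, 1);
	return ret;
}

/*
 * Models a driver receive path after patch 2/2: no private counter is
 * kept; the driver only emits an optional debug message on a drop.
 */
static void toy_driver_receive(struct toy_skb *skb,
			       enum toy_gro_result stack_verdict, int queue)
{
	if (toy_skb_finish(skb, stack_verdict) == TOY_GRO_DROP)
		fprintf(stderr, "%s: Receive Queue %d: Dropped packet from GRO\n",
			skb->dev->name, queue);
}
```

With this arrangement, any driver calling into the shared finish step gets drop accounting without adding its own statistic, which is the argument the cover letter makes for doing the count in the stack rather than in each driver.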
end of thread, other threads:[~2021-01-14 13:53 UTC | newest]

Thread overview: 16+ messages
2021-01-06 21:55 [Intel-wired-lan] [PATCH net-next v1 0/2] GRO drop accounting Jesse Brandeburg
2021-01-06 21:55 ` [Intel-wired-lan] [PATCH net-next v1 1/2] net: core: count drops from GRO Jesse Brandeburg
2021-01-07 18:47 ` Jacob Keller
2021-01-07 21:15 ` Alexander Duyck
2021-01-08 18:23 ` Jesse Brandeburg
2021-01-08  0:50 ` Shannon Nelson
2021-01-08 18:26 ` Jesse Brandeburg
2021-01-08 19:21 ` Shannon Nelson
2021-01-08 20:26 ` Saeed Mahameed
2021-01-08 22:17 ` Eric Dumazet
2021-01-14 13:53 ` Jamal Hadi Salim
2021-01-08  9:25 ` Eric Dumazet
2021-01-08 18:35 ` Jesse Brandeburg
2021-01-08 18:45 ` Eric Dumazet
2021-01-09  0:54 ` Jacob Keller
2021-01-06 21:55 ` [Intel-wired-lan] [PATCH net-next v1 2/2] ice: remove GRO drop accounting Jesse Brandeburg