* [PATCH 2.6.30] Network Drop Monitor: Make use of consume_skb() in af_can.c
@ 2009-04-16 19:58 Oliver Hartkopp
2009-04-17 8:38 ` David Miller
0 siblings, 1 reply; 13+ messages in thread
From: Oliver Hartkopp @ 2009-04-16 19:58 UTC (permalink / raw)
To: David Miller, Neil Horman; +Cc: Linux Netdev List
[-- Attachment #1: Type: text/plain, Size: 470 bytes --]
Since commit ead2ceb0ec9f85cff19c43b5cdb2f8a054484431, so-called end-of-line
points for skbs should use consume_skb() to free the socket buffer.
In contrast to consume_skb(), kfree_skb() is intended for unexpected skb
drops, e.g. in error conditions, which can now trigger the network drop
monitor if it is enabled.
This patch changes the skb end-of-line point in af_can.c to use consume_skb().
Signed-off-by: Oliver Hartkopp <oliver@hartkopp.net>
---
[-- Attachment #2: can_consume_skb.patch --]
[-- Type: text/x-patch, Size: 438 bytes --]
diff --git a/net/can/af_can.c b/net/can/af_can.c
index 547bafc..10f0528 100644
--- a/net/can/af_can.c
+++ b/net/can/af_can.c
@@ -674,8 +674,8 @@ static int can_rcv(struct sk_buff *skb, struct net_device *dev,
rcu_read_unlock();
- /* free the skbuff allocated by the netdevice driver */
- kfree_skb(skb);
+ /* consume the skbuff allocated by the netdevice driver */
+ consume_skb(skb);
if (matches > 0) {
can_stats.matches++;
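The distinction the patch relies on can be modelled in user space. The sketch below is not kernel code: toy_skb, toy_alloc_skb(), toy_kfree_skb(), toy_consume_skb() and the drop_events counter are hypothetical stand-ins. It assumes only what the commit message states, namely that the kfree_skb() path is the one a drop monitor observes and that consume_skb() marks a normal end of life.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct sk_buff; only a length is modelled. */
struct toy_skb {
	unsigned int len;
};

/* Hypothetical counter standing in for what a drop monitor would observe. */
static unsigned long drop_events;

struct toy_skb *toy_alloc_skb(unsigned int len)
{
	struct toy_skb *skb = malloc(sizeof(*skb));

	if (skb)
		skb->len = len;
	return skb;
}

/* Unexpected drop: make the event visible, then free the buffer. */
void toy_kfree_skb(struct toy_skb *skb)
{
	assert(skb != NULL);	/* the toy version requires a valid buffer */
	drop_events++;
	free(skb);
}

/* Normal end of life: free the buffer without signalling a drop. */
void toy_consume_skb(struct toy_skb *skb)
{
	assert(skb != NULL);
	free(skb);
}
```

In this model a receive path that finishes normally calls toy_consume_skb() and leaves drop_events untouched, while an error path calls toy_kfree_skb() and is counted.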
* Re: [PATCH 2.6.30] Network Drop Monitor: Make use of consume_skb() in af_can.c
2009-04-16 19:58 [PATCH 2.6.30] Network Drop Monitor: Make use of consume_skb() in af_can.c Oliver Hartkopp
@ 2009-04-17 8:38 ` David Miller
2009-04-17 8:56 ` [PATCH] loopback: packet drops accounting Eric Dumazet
0 siblings, 1 reply; 13+ messages in thread
From: David Miller @ 2009-04-17 8:38 UTC (permalink / raw)
To: oliver; +Cc: nhorman, netdev
From: Oliver Hartkopp <oliver@hartkopp.net>
Date: Thu, 16 Apr 2009 21:58:29 +0200
> Since commit ead2ceb0ec9f85cff19c43b5cdb2f8a054484431 so called end-of-line
> points for skb's should use consume_skb() to free the socket buffer.
>
> In opposite to consume_skb() the function kfree_skb() is intended to be used
> for unexpected skb drops e.g. in error conditions that now can trigger the
> network drop monitor if enabled.
>
> This patch moves the skb end-of-line point in af_can.c to use consume_skb().
>
> Signed-off-by: Oliver Hartkopp <oliver@hartkopp.net>
Applied, thanks.
* [PATCH] loopback: packet drops accounting
2009-04-17 8:38 ` David Miller
@ 2009-04-17 8:56 ` Eric Dumazet
2009-04-17 8:59 ` David Miller
0 siblings, 1 reply; 13+ messages in thread
From: Eric Dumazet @ 2009-04-17 8:56 UTC (permalink / raw)
To: David Miller; +Cc: netdev
In some situations we can drop packets in netif_rx().
The loopback driver does not report these (unlikely) drops in its stats,
and incorrectly changes the packet/byte counts.
With this patch applied, "ifconfig lo" can report these drops, as in:
# ifconfig lo
lo Link encap:Local Loopback
inet addr:127.0.0.1 Mask:255.0.0.0
UP LOOPBACK RUNNING MTU:16436 Metric:1
RX packets:692562900 errors:0 dropped:0 overruns:0 frame:0
TX packets:692562900 errors:3228 dropped:3228 overruns:0 carrier:0
collisions:0 txqueuelen:0
RX bytes:2865674174 (2.6 GiB) TX bytes:2865674174 (2.6 GiB)
I chose to reflect those errors only in tx_dropped/tx_errors, and not mirror
these errors in rx_dropped/rx_errors.
Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
---
drivers/net/loopback.c | 21 +++++++++++++++------
1 files changed, 15 insertions(+), 6 deletions(-)
diff --git a/drivers/net/loopback.c b/drivers/net/loopback.c
index b7d438a..a036296 100644
--- a/drivers/net/loopback.c
+++ b/drivers/net/loopback.c
@@ -62,6 +62,7 @@
struct pcpu_lstats {
unsigned long packets;
unsigned long bytes;
+ unsigned long drops;
};
/*
@@ -71,18 +72,22 @@ struct pcpu_lstats {
static int loopback_xmit(struct sk_buff *skb, struct net_device *dev)
{
struct pcpu_lstats *pcpu_lstats, *lb_stats;
+ int len;
skb_orphan(skb);
- skb->protocol = eth_type_trans(skb,dev);
+ skb->protocol = eth_type_trans(skb, dev);
/* it's OK to use per_cpu_ptr() because BHs are off */
pcpu_lstats = dev->ml_priv;
lb_stats = per_cpu_ptr(pcpu_lstats, smp_processor_id());
- lb_stats->bytes += skb->len;
- lb_stats->packets++;
- netif_rx(skb);
+ len = skb->len;
+ if (likely(netif_rx(skb) == NET_RX_SUCCESS)) {
+ lb_stats->bytes += len;
+ lb_stats->packets++;
+ } else
+ lb_stats->drops++;
return 0;
}
@@ -93,6 +98,7 @@ static struct net_device_stats *loopback_get_stats(struct net_device *dev)
struct net_device_stats *stats = &dev->stats;
unsigned long bytes = 0;
unsigned long packets = 0;
+ unsigned long drops = 0;
int i;
pcpu_lstats = dev->ml_priv;
@@ -102,11 +108,14 @@ static struct net_device_stats *loopback_get_stats(struct net_device *dev)
lb_stats = per_cpu_ptr(pcpu_lstats, i);
bytes += lb_stats->bytes;
packets += lb_stats->packets;
+ drops += lb_stats->drops;
}
stats->rx_packets = packets;
stats->tx_packets = packets;
- stats->rx_bytes = bytes;
- stats->tx_bytes = bytes;
+ stats->tx_dropped = drops;
+ stats->tx_errors = drops;
+ stats->rx_bytes = bytes;
+ stats->tx_bytes = bytes;
return stats;
}
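The accounting rule in this diff can be sketched as a small user-space model. Everything prefixed toy_ is hypothetical, and TOY_NR_CPUS is an arbitrary choice; the only things taken from the patch are the three per-CPU counters and the way they are summed into rx/tx totals. The length is passed by value to mirror the patch reading skb->len before netif_rx(), presumably because the buffer no longer belongs to the driver after that call.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_NR_CPUS 4	/* arbitrary size for the model */

/* Mirrors the three counters of struct pcpu_lstats in the patch. */
struct toy_lstats {
	unsigned long packets;
	unsigned long bytes;
	unsigned long drops;
};

/* The subset of device totals that the patch fills in. */
struct toy_totals {
	unsigned long rx_packets, tx_packets;
	unsigned long rx_bytes, tx_bytes;
	unsigned long tx_dropped, tx_errors;
};

static struct toy_lstats toy_stats[TOY_NR_CPUS];

/*
 * Account one transmit attempt in the given CPU slot.
 * `delivered` stands in for netif_rx() returning NET_RX_SUCCESS.
 */
void toy_account_xmit(size_t cpu, unsigned int len, int delivered)
{
	struct toy_lstats *s;

	assert(cpu < TOY_NR_CPUS);
	s = &toy_stats[cpu];
	if (delivered) {
		s->bytes += len;
		s->packets++;
	} else {
		s->drops++;
	}
}

/* Sum the per-CPU slots into one set of totals. */
struct toy_totals toy_get_stats(void)
{
	struct toy_totals t = { 0 };
	size_t i;

	for (i = 0; i < TOY_NR_CPUS; i++) {
		t.rx_packets += toy_stats[i].packets;
		t.rx_bytes += toy_stats[i].bytes;
		t.tx_dropped += toy_stats[i].drops;
	}
	t.tx_packets = t.rx_packets;
	t.tx_bytes = t.rx_bytes;
	t.tx_errors = t.tx_dropped;
	return t;
}
```

A failed hand-off now leaves the packet and byte counts alone and shows up only in the dropped/errors totals, which is the behaviour the ifconfig output in the description illustrates.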
* Re: [PATCH] loopback: packet drops accounting
2009-04-17 8:56 ` [PATCH] loopback: packet drops accounting Eric Dumazet
@ 2009-04-17 8:59 ` David Miller
2009-04-17 9:27 ` Eric Dumazet
0 siblings, 1 reply; 13+ messages in thread
From: David Miller @ 2009-04-17 8:59 UTC (permalink / raw)
To: dada1; +Cc: netdev
From: Eric Dumazet <dada1@cosmosbay.com>
Date: Fri, 17 Apr 2009 10:56:57 +0200
> We can in some situations drop packets in netif_rx()
>
> loopback driver does not report these (unlikely) drops to its stats,
> and incorrectly change packets/bytes counts.
>
> After this patch applied, "ifconfig lo" can reports these drops as in :
>
> # ifconfig lo
> lo Link encap:Local Loopback
> inet addr:127.0.0.1 Mask:255.0.0.0
> UP LOOPBACK RUNNING MTU:16436 Metric:1
> RX packets:692562900 errors:0 dropped:0 overruns:0 frame:0
> TX packets:692562900 errors:3228 dropped:3228 overruns:0 carrier:0
> collisions:0 txqueuelen:0
> RX bytes:2865674174 (2.6 GiB) TX bytes:2865674174 (2.6 GiB)
>
> I chose to reflect those errors only in tx_dropped/tx_errors, and not mirror
> these errors in rx_dropped/rx_errors.
>
> Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Well, logically the receive is what failed, not the transmit.
I think it's therefore misleading to count it as a TX drop.
Do you feel strongly about this?
* Re: [PATCH] loopback: packet drops accounting
2009-04-17 8:59 ` David Miller
@ 2009-04-17 9:27 ` Eric Dumazet
2009-04-17 10:06 ` [PATCH] loopback: better handling of packet drops Eric Dumazet
2009-04-18 8:03 ` [PATCH] loopback: packet drops accounting Eric Dumazet
0 siblings, 2 replies; 13+ messages in thread
From: Eric Dumazet @ 2009-04-17 9:27 UTC (permalink / raw)
To: David Miller; +Cc: netdev
David Miller wrote:
> From: Eric Dumazet <dada1@cosmosbay.com>
> Date: Fri, 17 Apr 2009 10:56:57 +0200
>
>> We can in some situations drop packets in netif_rx()
>>
>> loopback driver does not report these (unlikely) drops to its stats,
>> and incorrectly change packets/bytes counts.
>>
>> After this patch applied, "ifconfig lo" can reports these drops as in :
>>
>> # ifconfig lo
>> lo Link encap:Local Loopback
>> inet addr:127.0.0.1 Mask:255.0.0.0
>> UP LOOPBACK RUNNING MTU:16436 Metric:1
>> RX packets:692562900 errors:0 dropped:0 overruns:0 frame:0
>> TX packets:692562900 errors:3228 dropped:3228 overruns:0 carrier:0
>> collisions:0 txqueuelen:0
>> RX bytes:2865674174 (2.6 GiB) TX bytes:2865674174 (2.6 GiB)
>>
>> I chose to reflect those errors only in tx_dropped/tx_errors, and not mirror
>> these errors in rx_dropped/rx_errors.
>>
>> Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
>
> Well, logically the receive is what failed, not the transmit.
>
> I think it's therefore misleading to count it as a TX drop.
>
> Do you feel strongly about this?
Not at all, but my plan was to go a little bit further, i.e. being able to
return from loopback_xmit() with a nonzero value.
netif_rx() usage in the loopback device is biased, since it's really a transmit :)
Oh well...
* [PATCH] loopback: better handling of packet drops
2009-04-17 9:27 ` Eric Dumazet
@ 2009-04-17 10:06 ` Eric Dumazet
2009-04-17 10:33 ` Eric Dumazet
2009-04-18 8:03 ` [PATCH] loopback: packet drops accounting Eric Dumazet
1 sibling, 1 reply; 13+ messages in thread
From: Eric Dumazet @ 2009-04-17 10:06 UTC (permalink / raw)
To: David Miller; +Cc: netdev
Eric Dumazet wrote:
> David Miller wrote:
>> From: Eric Dumazet <dada1@cosmosbay.com>
>> Date: Fri, 17 Apr 2009 10:56:57 +0200
>>
>>> We can in some situations drop packets in netif_rx()
>>>
>>> loopback driver does not report these (unlikely) drops to its stats,
>>> and incorrectly change packets/bytes counts.
>>>
>>> After this patch applied, "ifconfig lo" can reports these drops as in :
>>>
>>> # ifconfig lo
>>> lo Link encap:Local Loopback
>>> inet addr:127.0.0.1 Mask:255.0.0.0
>>> UP LOOPBACK RUNNING MTU:16436 Metric:1
>>> RX packets:692562900 errors:0 dropped:0 overruns:0 frame:0
>>> TX packets:692562900 errors:3228 dropped:3228 overruns:0 carrier:0
>>> collisions:0 txqueuelen:0
>>> RX bytes:2865674174 (2.6 GiB) TX bytes:2865674174 (2.6 GiB)
>>>
>>> I chose to reflect those errors only in tx_dropped/tx_errors, and not mirror
>>> these errors in rx_dropped/rx_errors.
>>>
>>> Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
>> Well, logically the receive is what failed, not the transmit.
>>
>> I think it's therefore misleading to count it as a TX drop.
>>
>> Do you feel strongly about this?
>
> Not at all, but my plan was to go a litle bit further, ie being able to
> return from loopback_xmit() with a non null value.
>
Something like this :
[PATCH] loopback: better handling of packet drops
In some situations we can drop packets in netif_rx().
The loopback driver does not report these (unlikely) drops in its stats,
and incorrectly changes the packet/byte counts. Also, upper layers are
not warned of these transmit failures.
With this patch applied, "ifconfig lo" can report these drops, as in:
# ifconfig lo
lo Link encap:Local Loopback
inet addr:127.0.0.1 Mask:255.0.0.0
UP LOOPBACK RUNNING MTU:16436 Metric:1
RX packets:692562900 errors:0 dropped:0 overruns:0 frame:0
TX packets:692562900 errors:3228 dropped:3228 overruns:0 carrier:0
collisions:0 txqueuelen:0
RX bytes:2865674174 (2.6 GiB) TX bytes:2865674174 (2.6 GiB)
Moreover, loopback_xmit() can now tell its caller that the
packet was not transmitted, for better queue management and error handling.
I chose to reflect those errors only in tx_dropped/tx_errors, and not mirror
them in rx_dropped/rx_errors.
Splitting netif_rx() with a helper function boosts tbench performance by 1%,
because we can avoid two tests (for netpoll and timestamping).
Tested with /proc/sys/net/core/netdev_max_backlog set to 0: tbench
can run at full speed even with some 'losses' on loopback. No more
TCP stalls...
Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
---
drivers/net/loopback.c | 24 +++++++++---
include/linux/netdevice.h | 1
net/core/dev.c | 68 +++++++++++++++++++++++-------------
3 files changed, 62 insertions(+), 31 deletions(-)
diff --git a/drivers/net/loopback.c b/drivers/net/loopback.c
index b7d438a..a1308fd 100644
--- a/drivers/net/loopback.c
+++ b/drivers/net/loopback.c
@@ -62,6 +62,7 @@
struct pcpu_lstats {
unsigned long packets;
unsigned long bytes;
+ unsigned long drops;
};
/*
@@ -71,20 +72,25 @@ struct pcpu_lstats {
static int loopback_xmit(struct sk_buff *skb, struct net_device *dev)
{
struct pcpu_lstats *pcpu_lstats, *lb_stats;
+ int len;
skb_orphan(skb);
- skb->protocol = eth_type_trans(skb,dev);
+ skb->protocol = eth_type_trans(skb, dev);
/* it's OK to use per_cpu_ptr() because BHs are off */
pcpu_lstats = dev->ml_priv;
lb_stats = per_cpu_ptr(pcpu_lstats, smp_processor_id());
- lb_stats->bytes += skb->len;
- lb_stats->packets++;
- netif_rx(skb);
+ len = skb->len;
+ if (likely(__netif_rx(skb) == NET_RX_SUCCESS)) {
+ lb_stats->bytes += len;
+ lb_stats->packets++;
+ return 0;
+ }
+ lb_stats->drops++;
- return 0;
+ return 1;
}
static struct net_device_stats *loopback_get_stats(struct net_device *dev)
@@ -93,6 +99,7 @@ static struct net_device_stats *loopback_get_stats(struct net_device *dev)
struct net_device_stats *stats = &dev->stats;
unsigned long bytes = 0;
unsigned long packets = 0;
+ unsigned long drops = 0;
int i;
pcpu_lstats = dev->ml_priv;
@@ -102,11 +109,14 @@ static struct net_device_stats *loopback_get_stats(struct net_device *dev)
lb_stats = per_cpu_ptr(pcpu_lstats, i);
bytes += lb_stats->bytes;
packets += lb_stats->packets;
+ drops += lb_stats->drops;
}
stats->rx_packets = packets;
stats->tx_packets = packets;
- stats->rx_bytes = bytes;
- stats->tx_bytes = bytes;
+ stats->tx_dropped = drops;
+ stats->tx_errors = drops;
+ stats->rx_bytes = bytes;
+ stats->tx_bytes = bytes;
return stats;
}
diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index 2e7783f..c60e250 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -1430,6 +1430,7 @@ extern void dev_kfree_skb_irq(struct sk_buff *skb);
extern void dev_kfree_skb_any(struct sk_buff *skb);
#define HAVE_NETIF_RX 1
+extern int __netif_rx(struct sk_buff *skb);
extern int netif_rx(struct sk_buff *skb);
extern int netif_rx_ni(struct sk_buff *skb);
#define HAVE_NETIF_RECEIVE_SKB 1
diff --git a/net/core/dev.c b/net/core/dev.c
index 343883f..8ae3f19 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -1909,6 +1909,44 @@ int weight_p __read_mostly = 64; /* old backlog weight */
DEFINE_PER_CPU(struct netif_rx_stats, netdev_rx_stat) = { 0, };
+/*
+ * helper function called from netif_rx() or loopback_xmit()
+ */
+int __netif_rx(struct sk_buff *skb)
+{
+ struct softnet_data *queue;
+ unsigned long flags;
+
+ /*
+ * The code is rearranged so that the path is the most
+ * short when CPU is congested, but is still operating.
+ */
+ local_irq_save(flags);
+ queue = &__get_cpu_var(softnet_data);
+
+ __get_cpu_var(netdev_rx_stat).total++;
+ if (queue->input_pkt_queue.qlen <= netdev_max_backlog) {
+ if (queue->input_pkt_queue.qlen) {
+enqueue:
+ __skb_queue_tail(&queue->input_pkt_queue, skb);
+ local_irq_restore(flags);
+ return NET_RX_SUCCESS;
+ }
+
+ napi_schedule(&queue->backlog);
+ goto enqueue;
+ }
+
+ __get_cpu_var(netdev_rx_stat).dropped++;
+ local_irq_restore(flags);
+ /*
+ * Dont free skb here.
+ * netif_rx() will call kfree_skb(skb)
+ * loopback_xmit() will not free it but return an error to its caller
+ */
+ return NET_RX_DROP;
+}
+
/**
* netif_rx - post buffer to the network code
* @skb: buffer to post
@@ -1928,6 +1966,7 @@ int netif_rx(struct sk_buff *skb)
{
struct softnet_data *queue;
unsigned long flags;
+ int ret;
/* if netpoll wants it, pretend we never saw it */
if (netpoll_rx(skb))
@@ -1936,32 +1975,14 @@ int netif_rx(struct sk_buff *skb)
if (!skb->tstamp.tv64)
net_timestamp(skb);
- /*
- * The code is rearranged so that the path is the most
- * short when CPU is congested, but is still operating.
- */
- local_irq_save(flags);
- queue = &__get_cpu_var(softnet_data);
-
- __get_cpu_var(netdev_rx_stat).total++;
- if (queue->input_pkt_queue.qlen <= netdev_max_backlog) {
- if (queue->input_pkt_queue.qlen) {
-enqueue:
- __skb_queue_tail(&queue->input_pkt_queue, skb);
- local_irq_restore(flags);
- return NET_RX_SUCCESS;
- }
-
- napi_schedule(&queue->backlog);
- goto enqueue;
- }
+ ret = __netif_rx(skb);
- __get_cpu_var(netdev_rx_stat).dropped++;
- local_irq_restore(flags);
+ if (unlikely(ret == NET_RX_DROP))
+ kfree_skb(skb);
- kfree_skb(skb);
- return NET_RX_DROP;
+ return ret;
}
+EXPORT_SYMBOL(netif_rx);
int netif_rx_ni(struct sk_buff *skb)
{
@@ -5307,7 +5328,6 @@ EXPORT_SYMBOL(netdev_boot_setup_check);
EXPORT_SYMBOL(netdev_set_master);
EXPORT_SYMBOL(netdev_state_change);
EXPORT_SYMBOL(netif_receive_skb);
-EXPORT_SYMBOL(netif_rx);
EXPORT_SYMBOL(register_gifconf);
EXPORT_SYMBOL(register_netdevice);
EXPORT_SYMBOL(register_netdevice_notifier);
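The ownership split introduced by __netif_rx() can be sketched in user space. This is a simplified model under stated assumptions, not the kernel implementation: toy_pkt, toy_queue, toy_enqueue() and toy_rx() are hypothetical names, and locking, per-CPU data and NAPI scheduling are left out. What it keeps from the diff is the `qlen <= max_backlog` test and the rule that the helper never frees on a drop while the wrapper does.

```c
#include <assert.h>
#include <stdlib.h>

enum { TOY_RX_SUCCESS = 0, TOY_RX_DROP = 1 };

struct toy_pkt {
	struct toy_pkt *next;
};

struct toy_queue {
	struct toy_pkt *head, *tail;
	unsigned int qlen;
	unsigned int max_backlog;	/* plays the role of netdev_max_backlog */
	unsigned long dropped;
};

/*
 * Helper in the role of __netif_rx(): enqueue if the backlog test passes,
 * otherwise count a drop and leave the packet with the caller.
 */
int toy_enqueue(struct toy_queue *q, struct toy_pkt *p)
{
	assert(q != NULL && p != NULL);
	if (q->qlen <= q->max_backlog) {
		p->next = NULL;
		if (q->tail)
			q->tail->next = p;
		else
			q->head = p;
		q->tail = p;
		q->qlen++;
		return TOY_RX_SUCCESS;
	}
	q->dropped++;
	return TOY_RX_DROP;	/* packet is still owned by the caller */
}

/* Wrapper in the role of netif_rx(): frees the packet when it is dropped. */
int toy_rx(struct toy_queue *q, struct toy_pkt *p)
{
	int ret = toy_enqueue(q, p);

	if (ret == TOY_RX_DROP)
		free(p);
	return ret;
}
```

With max_backlog set to 0, as in the test described in the changelog, the comparison still admits one packet into an empty queue, and every further packet is reported as dropped until the queue drains.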
* Re: [PATCH] loopback: better handling of packet drops
2009-04-17 10:06 ` [PATCH] loopback: better handling of packet drops Eric Dumazet
@ 2009-04-17 10:33 ` Eric Dumazet
2009-04-17 10:51 ` David Miller
2009-04-17 14:58 ` Stephen Hemminger
0 siblings, 2 replies; 13+ messages in thread
From: Eric Dumazet @ 2009-04-17 10:33 UTC (permalink / raw)
To: David Miller; +Cc: netdev
Eric Dumazet wrote:
> Eric Dumazet wrote:
>> David Miller wrote:
>>> From: Eric Dumazet <dada1@cosmosbay.com>
>>> Date: Fri, 17 Apr 2009 10:56:57 +0200
>>>
>>>> We can in some situations drop packets in netif_rx()
>>>>
>>>> loopback driver does not report these (unlikely) drops to its stats,
>>>> and incorrectly change packets/bytes counts.
>>>>
>>>> After this patch applied, "ifconfig lo" can reports these drops as in :
>>>>
>>>> # ifconfig lo
>>>> lo Link encap:Local Loopback
>>>> inet addr:127.0.0.1 Mask:255.0.0.0
>>>> UP LOOPBACK RUNNING MTU:16436 Metric:1
>>>> RX packets:692562900 errors:0 dropped:0 overruns:0 frame:0
>>>> TX packets:692562900 errors:3228 dropped:3228 overruns:0 carrier:0
>>>> collisions:0 txqueuelen:0
>>>> RX bytes:2865674174 (2.6 GiB) TX bytes:2865674174 (2.6 GiB)
>>>>
>>>> I chose to reflect those errors only in tx_dropped/tx_errors, and not mirror
>>>> these errors in rx_dropped/rx_errors.
>>>>
>>>> Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
>>> Well, logically the receive is what failed, not the transmit.
>>>
>>> I think it's therefore misleading to count it as a TX drop.
>>>
>>> Do you feel strongly about this?
>> Not at all, but my plan was to go a litle bit further, ie being able to
>> return from loopback_xmit() with a non null value.
>>
>
> Something like this :
I just noticed NETDEV_TX_BUSY & NETDEV_TX_OK, so here is an updated version
using these macros instead of 0 & 1
[PATCH] loopback: better handling of packet drops
In some situations we can drop packets in netif_rx().
The loopback driver does not report these (unlikely) drops in its stats,
and incorrectly changes the packet/byte counts. Also, upper layers are
not warned of these transmit failures.
With this patch applied, "ifconfig lo" can report these drops, as in:
# ifconfig lo
lo Link encap:Local Loopback
inet addr:127.0.0.1 Mask:255.0.0.0
UP LOOPBACK RUNNING MTU:16436 Metric:1
RX packets:692562900 errors:0 dropped:0 overruns:0 frame:0
TX packets:692562900 errors:3228 dropped:3228 overruns:0 carrier:0
collisions:0 txqueuelen:0
RX bytes:2865674174 (2.6 GiB) TX bytes:2865674174 (2.6 GiB)
Moreover, loopback_xmit() can now tell its caller that the
packet was not transmitted, for better queue management and error handling.
I chose to reflect those errors only in tx_dropped/tx_errors, and not mirror
them in rx_dropped/rx_errors.
Splitting netif_rx() with a helper function boosts tbench performance by 1%,
because we can avoid two tests (for netpoll and timestamping).
Tested with /proc/sys/net/core/netdev_max_backlog set to 0: tbench
can run at full speed even with some 'losses' on loopback. No more
TCP stalls...
Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
---
drivers/net/loopback.c | 24 +++++++++---
include/linux/netdevice.h | 1
net/core/dev.c | 68 +++++++++++++++++++++++-------------
3 files changed, 62 insertions(+), 31 deletions(-)
diff --git a/drivers/net/loopback.c b/drivers/net/loopback.c
index b7d438a..101a3bc 100644
--- a/drivers/net/loopback.c
+++ b/drivers/net/loopback.c
@@ -62,6 +62,7 @@
struct pcpu_lstats {
unsigned long packets;
unsigned long bytes;
+ unsigned long drops;
};
/*
@@ -71,20 +72,25 @@ struct pcpu_lstats {
static int loopback_xmit(struct sk_buff *skb, struct net_device *dev)
{
struct pcpu_lstats *pcpu_lstats, *lb_stats;
+ int len;
skb_orphan(skb);
- skb->protocol = eth_type_trans(skb,dev);
+ skb->protocol = eth_type_trans(skb, dev);
/* it's OK to use per_cpu_ptr() because BHs are off */
pcpu_lstats = dev->ml_priv;
lb_stats = per_cpu_ptr(pcpu_lstats, smp_processor_id());
- lb_stats->bytes += skb->len;
- lb_stats->packets++;
- netif_rx(skb);
+ len = skb->len;
+ if (likely(__netif_rx(skb) == NET_RX_SUCCESS)) {
+ lb_stats->bytes += len;
+ lb_stats->packets++;
+ return NETDEV_TX_OK;
+ }
+ lb_stats->drops++;
- return 0;
+ return NETDEV_TX_BUSY;
}
static struct net_device_stats *loopback_get_stats(struct net_device *dev)
@@ -93,6 +99,7 @@ static struct net_device_stats *loopback_get_stats(struct net_device *dev)
struct net_device_stats *stats = &dev->stats;
unsigned long bytes = 0;
unsigned long packets = 0;
+ unsigned long drops = 0;
int i;
pcpu_lstats = dev->ml_priv;
@@ -102,11 +109,14 @@ static struct net_device_stats *loopback_get_stats(struct net_device *dev)
lb_stats = per_cpu_ptr(pcpu_lstats, i);
bytes += lb_stats->bytes;
packets += lb_stats->packets;
+ drops += lb_stats->drops;
}
stats->rx_packets = packets;
stats->tx_packets = packets;
- stats->rx_bytes = bytes;
- stats->tx_bytes = bytes;
+ stats->tx_dropped = drops;
+ stats->tx_errors = drops;
+ stats->rx_bytes = bytes;
+ stats->tx_bytes = bytes;
return stats;
}
diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index 2e7783f..c60e250 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -1430,6 +1430,7 @@ extern void dev_kfree_skb_irq(struct sk_buff *skb);
extern void dev_kfree_skb_any(struct sk_buff *skb);
#define HAVE_NETIF_RX 1
+extern int __netif_rx(struct sk_buff *skb);
extern int netif_rx(struct sk_buff *skb);
extern int netif_rx_ni(struct sk_buff *skb);
#define HAVE_NETIF_RECEIVE_SKB 1
diff --git a/net/core/dev.c b/net/core/dev.c
index 343883f..8ae3f19 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -1909,6 +1909,44 @@ int weight_p __read_mostly = 64; /* old backlog weight */
DEFINE_PER_CPU(struct netif_rx_stats, netdev_rx_stat) = { 0, };
+/*
+ * helper function called from netif_rx() or loopback_xmit()
+ */
+int __netif_rx(struct sk_buff *skb)
+{
+ struct softnet_data *queue;
+ unsigned long flags;
+
+ /*
+ * The code is rearranged so that the path is the most
+ * short when CPU is congested, but is still operating.
+ */
+ local_irq_save(flags);
+ queue = &__get_cpu_var(softnet_data);
+
+ __get_cpu_var(netdev_rx_stat).total++;
+ if (queue->input_pkt_queue.qlen <= netdev_max_backlog) {
+ if (queue->input_pkt_queue.qlen) {
+enqueue:
+ __skb_queue_tail(&queue->input_pkt_queue, skb);
+ local_irq_restore(flags);
+ return NET_RX_SUCCESS;
+ }
+
+ napi_schedule(&queue->backlog);
+ goto enqueue;
+ }
+
+ __get_cpu_var(netdev_rx_stat).dropped++;
+ local_irq_restore(flags);
+ /*
+ * Dont free skb here.
+ * netif_rx() will call kfree_skb(skb)
+ * loopback_xmit() will not free it but return an error to its caller
+ */
+ return NET_RX_DROP;
+}
+
/**
* netif_rx - post buffer to the network code
* @skb: buffer to post
@@ -1928,6 +1966,7 @@ int netif_rx(struct sk_buff *skb)
{
struct softnet_data *queue;
unsigned long flags;
+ int ret;
/* if netpoll wants it, pretend we never saw it */
if (netpoll_rx(skb))
@@ -1936,32 +1975,14 @@ int netif_rx(struct sk_buff *skb)
if (!skb->tstamp.tv64)
net_timestamp(skb);
- /*
- * The code is rearranged so that the path is the most
- * short when CPU is congested, but is still operating.
- */
- local_irq_save(flags);
- queue = &__get_cpu_var(softnet_data);
-
- __get_cpu_var(netdev_rx_stat).total++;
- if (queue->input_pkt_queue.qlen <= netdev_max_backlog) {
- if (queue->input_pkt_queue.qlen) {
-enqueue:
- __skb_queue_tail(&queue->input_pkt_queue, skb);
- local_irq_restore(flags);
- return NET_RX_SUCCESS;
- }
-
- napi_schedule(&queue->backlog);
- goto enqueue;
- }
+ ret = __netif_rx(skb);
- __get_cpu_var(netdev_rx_stat).dropped++;
- local_irq_restore(flags);
+ if (unlikely(ret == NET_RX_DROP))
+ kfree_skb(skb);
- kfree_skb(skb);
- return NET_RX_DROP;
+ return ret;
}
+EXPORT_SYMBOL(netif_rx);
int netif_rx_ni(struct sk_buff *skb)
{
@@ -5307,7 +5328,6 @@ EXPORT_SYMBOL(netdev_boot_setup_check);
EXPORT_SYMBOL(netdev_set_master);
EXPORT_SYMBOL(netdev_state_change);
EXPORT_SYMBOL(netif_receive_skb);
-EXPORT_SYMBOL(netif_rx);
EXPORT_SYMBOL(register_gifconf);
EXPORT_SYMBOL(register_netdevice);
EXPORT_SYMBOL(register_netdevice_notifier);
* Re: [PATCH] loopback: better handling of packet drops
2009-04-17 10:33 ` Eric Dumazet
@ 2009-04-17 10:51 ` David Miller
2009-04-17 12:22 ` Eric Dumazet
2009-04-17 14:58 ` Stephen Hemminger
1 sibling, 1 reply; 13+ messages in thread
From: David Miller @ 2009-04-17 10:51 UTC (permalink / raw)
To: dada1; +Cc: netdev
From: Eric Dumazet <dada1@cosmosbay.com>
Date: Fri, 17 Apr 2009 12:33:33 +0200
> Splitting netif_rx() with a helper function boosts tbench
> performance by 1%, because we can avoid two tests (about netpoll and
> timestamping)
Loopback is not a special device no matter how much you wish
it might be :-)
This is why I haven't really pursued any further those patches I
showed you that treat local TCP connections specially: they had the
real possibility of breaking clever things people might be doing over
loopback using the packet scheduler classifier and packet scheduler
actions.
I also think it is valid to use netpoll over loopback, especially for
testing.
So please undo this part of the patch. You always try to combine
multiple distinct changes, and I would have taken just your TX drop
change if you hadn't added this __netif_rx() stuff to it :-(
* Re: [PATCH] loopback: better handling of packet drops
2009-04-17 10:51 ` David Miller
@ 2009-04-17 12:22 ` Eric Dumazet
0 siblings, 0 replies; 13+ messages in thread
From: Eric Dumazet @ 2009-04-17 12:22 UTC (permalink / raw)
To: David Miller; +Cc: netdev
David Miller wrote:
> From: Eric Dumazet <dada1@cosmosbay.com>
> Date: Fri, 17 Apr 2009 12:33:33 +0200
>
>> Splitting netif_rx() with a helper function boosts tbench
>> performance by 1%, because we can avoid two tests (about netpoll and
>> timestamping)
>
> Loopback is not a special device no matter how much you wish
> it might be :-)
>
> This is why I haven't really pursued any further those patches I
> showed you that treat local TCP connections specially, it just had the
> realy possibility to break clever things people might be doing over
> loopback using the packet scheduler classifier and packet scheduler
> actions.
Point taken.
>
> I also think it is valid to use netpoll over loopback, especially for
> testing.
Oh, I didn't know it was possible/useful, sorry about that.
>
> So please undo this part of the patch. You always try to combine
> multiple distinct changes, and I would have taken just your TX drop
> change if you hadn't added this __netif_rx() stuff to it :-(
I followed up with this patch to show what I had in mind, and why
I thought it was a transmit error more than a receive one.
1) Do you reject the idea of splitting netif_rx() so that the skb is
not freed in case of congestion?
2) If not, do you want me to send two separate patches?
3) Should I update rx_errors or tx_errors or both?
Thank you
* Re: [PATCH] loopback: better handling of packet drops
2009-04-17 10:33 ` Eric Dumazet
2009-04-17 10:51 ` David Miller
@ 2009-04-17 14:58 ` Stephen Hemminger
2009-04-17 15:05 ` Eric Dumazet
1 sibling, 1 reply; 13+ messages in thread
From: Stephen Hemminger @ 2009-04-17 14:58 UTC (permalink / raw)
To: Eric Dumazet; +Cc: David Miller, netdev
On Fri, 17 Apr 2009 12:33:33 +0200
Eric Dumazet <dada1@cosmosbay.com> wrote:
> Eric Dumazet wrote:
> > Eric Dumazet wrote:
> >> David Miller wrote:
> >>> From: Eric Dumazet <dada1@cosmosbay.com>
> >>> Date: Fri, 17 Apr 2009 10:56:57 +0200
> >>>
> >>>> We can in some situations drop packets in netif_rx()
> >>>>
> >>>> loopback driver does not report these (unlikely) drops to its stats,
> >>>> and incorrectly change packets/bytes counts.
> >>>>
> >>>> After this patch applied, "ifconfig lo" can reports these drops as in :
> >>>>
> >>>> # ifconfig lo
> >>>> lo Link encap:Local Loopback
> >>>> inet addr:127.0.0.1 Mask:255.0.0.0
> >>>> UP LOOPBACK RUNNING MTU:16436 Metric:1
> >>>> RX packets:692562900 errors:0 dropped:0 overruns:0 frame:0
> >>>> TX packets:692562900 errors:3228 dropped:3228 overruns:0 carrier:0
> >>>> collisions:0 txqueuelen:0
> >>>> RX bytes:2865674174 (2.6 GiB) TX bytes:2865674174 (2.6 GiB)
> >>>>
> >>>> I chose to reflect those errors only in tx_dropped/tx_errors, and not mirror
> >>>> these errors in rx_dropped/rx_errors.
> >>>>
> >>>> Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
> >>> Well, logically the receive is what failed, not the transmit.
> >>>
> >>> I think it's therefore misleading to count it as a TX drop.
> >>>
> >>> Do you feel strongly about this?
> >> Not at all, but my plan was to go a litle bit further, ie being able to
> >> return from loopback_xmit() with a non null value.
> >>
> >
> > Something like this :
>
> I just noticed NETDEV_TX_BUSY & NETDEV_TX_OK, so here is an updated version
> using these macros instead of 0 & 1
>
> [PATCH] loopback: better handling of packet drops
>
> We can in some situations drop packets in netif_rx()
>
> loopback driver does not report these (unlikely) drops to its stats,
> and incorrectly change packets/bytes counts. Also upper layers are
> not warned of these transmit failures.
>
> After this patch applied, "ifconfig lo" can reports these drops as in :
>
> # ifconfig lo
> lo Link encap:Local Loopback
> inet addr:127.0.0.1 Mask:255.0.0.0
> UP LOOPBACK RUNNING MTU:16436 Metric:1
> RX packets:692562900 errors:0 dropped:0 overruns:0 frame:0
> TX packets:692562900 errors:3228 dropped:3228 overruns:0 carrier:0
> collisions:0 txqueuelen:0
> RX bytes:2865674174 (2.6 GiB) TX bytes:2865674174 (2.6 GiB)
>
> More over, loopback_xmit() can now return to its caller the indication that
> packet was not transmitted for better queue management and error handling.
>
> I chose to reflect those errors only in tx_dropped/tx_errors, and not mirror
> them in rx_dropped/rx_errors.
>
> Splitting netif_rx() with a helper function boosts tbench performance by 1%,
> because we can avoid two tests (about netpoll and timestamping)
>
> Tested with /proc/sys/net/core/netdev_max_backlog set to 0, tbench
> can run at full speed even with some 'losses' on loopback. No more
> tcp stalls...
>
> Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
> ---
> drivers/net/loopback.c | 24 +++++++++---
> include/linux/netdevice.h | 1
> net/core/dev.c | 68 +++++++++++++++++++++++-------------
> 3 files changed, 62 insertions(+), 31 deletions(-)
>
> diff --git a/drivers/net/loopback.c b/drivers/net/loopback.c
> index b7d438a..101a3bc 100644
> --- a/drivers/net/loopback.c
> +++ b/drivers/net/loopback.c
> @@ -62,6 +62,7 @@
> struct pcpu_lstats {
> unsigned long packets;
> unsigned long bytes;
> + unsigned long drops;
> };
>
> /*
> @@ -71,20 +72,25 @@ struct pcpu_lstats {
> static int loopback_xmit(struct sk_buff *skb, struct net_device *dev)
> {
> struct pcpu_lstats *pcpu_lstats, *lb_stats;
> + int len;
>
> skb_orphan(skb);
>
> - skb->protocol = eth_type_trans(skb,dev);
> + skb->protocol = eth_type_trans(skb, dev);
>
> /* it's OK to use per_cpu_ptr() because BHs are off */
> pcpu_lstats = dev->ml_priv;
> lb_stats = per_cpu_ptr(pcpu_lstats, smp_processor_id());
> - lb_stats->bytes += skb->len;
> - lb_stats->packets++;
>
> - netif_rx(skb);
> + len = skb->len;
> + if (likely(__netif_rx(skb) == NET_RX_SUCCESS)) {
> + lb_stats->bytes += len;
> + lb_stats->packets++;
> + return NETDEV_TX_OK;
> + }
> + lb_stats->drops++;
>
> - return 0;
> + return NETDEV_TX_BUSY;
> }
If you return NETDEV_TX_BUSY, then the xmit logic will retry
so it is not really a drop but a stall. I think it is confusing
to call this a packet loss.
* Re: [PATCH] loopback: better handling of packet drops
2009-04-17 14:58 ` Stephen Hemminger
@ 2009-04-17 15:05 ` Eric Dumazet
0 siblings, 0 replies; 13+ messages in thread
From: Eric Dumazet @ 2009-04-17 15:05 UTC (permalink / raw)
To: Stephen Hemminger; +Cc: David Miller, netdev
Stephen Hemminger wrote:
> On Fri, 17 Apr 2009 12:33:33 +0200
> Eric Dumazet <dada1@cosmosbay.com> wrote:
>> static int loopback_xmit(struct sk_buff *skb, struct net_device *dev)
>> {
>> struct pcpu_lstats *pcpu_lstats, *lb_stats;
>> + int len;
>>
>> skb_orphan(skb);
>>
>> - skb->protocol = eth_type_trans(skb,dev);
>> + skb->protocol = eth_type_trans(skb, dev);
>>
>> /* it's OK to use per_cpu_ptr() because BHs are off */
>> pcpu_lstats = dev->ml_priv;
>> lb_stats = per_cpu_ptr(pcpu_lstats, smp_processor_id());
>> - lb_stats->bytes += skb->len;
>> - lb_stats->packets++;
>>
>> - netif_rx(skb);
>> + len = skb->len;
>> + if (likely(__netif_rx(skb) == NET_RX_SUCCESS)) {
>> + lb_stats->bytes += len;
>> + lb_stats->packets++;
>> + return NETDEV_TX_OK;
>> + }
>> + lb_stats->drops++;
>>
>> - return 0;
>> + return NETDEV_TX_BUSY;
>> }
>
> If you return NETDEV_TX_BUSY, then the xmit logic will retry
> so it is not really a drop but a stall. I think it is confusing
> to call this a packet loss.
Good point, thanks.
So we should not account for this stall in the dev stats? Maybe in 'collisions'?
I also discovered that we have to do
skb_push(skb, ETH_HLEN); /* undo the skb_pull() done in eth_type_trans() */
before returning NETDEV_TX_BUSY;
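The point about undoing the pull can be illustrated with a small userspace model. This is only a sketch under stated assumptions: `toy_skb`, `toy_pull()`, `toy_push()` and `toy_xmit()` are hypothetical stand-ins invented for illustration, not kernel APIs, and the only behaviour modelled is that eth_type_trans() pulls ETH_HLEN bytes from the front of the buffer. It shows that a packet handed back for retry without the matching push would lose its header on every attempt.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_ETH_HLEN 14 /* same value as the kernel's ETH_HLEN */

/* Hypothetical stand-in for struct sk_buff: just an offset and a length. */
struct toy_skb {
	size_t data_off; /* where the current data starts in the buffer */
	size_t len;      /* bytes from data_off to the end of the payload */
};

/* Model of skb_pull(): remove n bytes from the front of the data area. */
void toy_pull(struct toy_skb *skb, size_t n)
{
	assert(n <= skb->len);
	skb->data_off += n;
	skb->len -= n;
}

/* Model of skb_push(): give n bytes back at the front of the data area. */
void toy_push(struct toy_skb *skb, size_t n)
{
	assert(n <= skb->data_off);
	skb->data_off -= n;
	skb->len += n;
}

/*
 * Model of one transmit attempt. rx_ok says whether the receive side
 * accepted the packet. Returns 0 for "sent" and 1 for "busy, retry".
 * With undo_pull set, the header is restored before reporting busy.
 */
int toy_xmit(struct toy_skb *skb, int rx_ok, int undo_pull)
{
	toy_pull(skb, TOY_ETH_HLEN); /* the effect eth_type_trans() has */
	if (rx_ok)
		return 0;
	if (undo_pull)
		toy_push(skb, TOY_ETH_HLEN);
	return 1;
}
```

In this model a failed attempt with `undo_pull` set leaves the buffer exactly as it was, while a failed attempt without it leaves the buffer 14 bytes shorter, so a retry would pull a second "header" out of the payload.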
* Re: [PATCH] loopback: packet drops accounting
2009-04-17 9:27 ` Eric Dumazet
2009-04-17 10:06 ` [PATCH] loopback: better handling of packet drops Eric Dumazet
@ 2009-04-18 8:03 ` Eric Dumazet
2009-04-20 9:26 ` David Miller
1 sibling, 1 reply; 13+ messages in thread
From: Eric Dumazet @ 2009-04-18 8:03 UTC (permalink / raw)
To: David Miller; +Cc: netdev
Eric Dumazet wrote:
> David Miller wrote:
>> From: Eric Dumazet <dada1@cosmosbay.com>
>> Date: Fri, 17 Apr 2009 10:56:57 +0200
>>
>>> In some situations we can drop packets in netif_rx().
>>>
>>> The loopback driver does not report these (unlikely) drops in its stats,
>>> and incorrectly changes the packets/bytes counts.
>>>
>>> With this patch applied, "ifconfig lo" can report these drops, as in:
>>>
>>> # ifconfig lo
>>> lo Link encap:Local Loopback
>>> inet addr:127.0.0.1 Mask:255.0.0.0
>>> UP LOOPBACK RUNNING MTU:16436 Metric:1
>>> RX packets:692562900 errors:0 dropped:0 overruns:0 frame:0
>>> TX packets:692562900 errors:3228 dropped:3228 overruns:0 carrier:0
>>> collisions:0 txqueuelen:0
>>> RX bytes:2865674174 (2.6 GiB) TX bytes:2865674174 (2.6 GiB)
>>>
>>> I chose to reflect those errors only in tx_dropped/tx_errors, and not mirror
>>> these errors in rx_dropped/rx_errors.
>>>
>>> Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
>> Well, logically the receive is what failed, not the transmit.
>>
>> I think it's therefore misleading to count it as a TX drop.
>>
>> Do you feel strongly about this?
>
Hi David
You were right :)
loopback_xmit() calls eth_type_trans(skb, dev), and this function already
starts the RX handling of the packet (skb_pull(...)).
So trying to make loopback_xmit() return NETDEV_TX_BUSY is wrong too,
or too complex, because we would have to undo all the changes that might
have happened during the failed xmit. This would be hard to maintain and
over-engineered, given that these drops are very unlikely.
So I am resubmitting my initial patch, changed to report the errors in the
rx stats.
Thanks
[PATCH] loopback: packet drops accounting
In some situations we can drop packets in netif_rx().
The loopback driver does not report these (unlikely) drops in its stats,
and incorrectly changes the packets/bytes counts.
With this patch applied, "ifconfig lo" can report these drops, as in:
# ifconfig lo
lo Link encap:Local Loopback
inet addr:127.0.0.1 Mask:255.0.0.0
UP LOOPBACK RUNNING MTU:16436 Metric:1
RX packets:692562900 errors:3228 dropped:3228 overruns:0 frame:0
TX packets:692562900 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:0
RX bytes:2865674174 (2.6 GiB) TX bytes:2865674174 (2.6 GiB)
I initially chose to reflect those errors only in tx_dropped/tx_errors, but David
convinced me that they are really RX errors, as loopback_xmit() really starts
RX processing (calling eth_type_trans(), for example, which itself pulls the ethernet header).
These errors are accounted in rx_dropped/rx_errors.
Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
---
drivers/net/loopback.c | 21 +++++++++++++++------
1 files changed, 15 insertions(+), 6 deletions(-)
diff --git a/drivers/net/loopback.c b/drivers/net/loopback.c
index b7d438a..a036296 100644
--- a/drivers/net/loopback.c
+++ b/drivers/net/loopback.c
@@ -62,6 +62,7 @@
struct pcpu_lstats {
unsigned long packets;
unsigned long bytes;
+ unsigned long drops;
};
/*
@@ -71,18 +72,22 @@ struct pcpu_lstats {
static int loopback_xmit(struct sk_buff *skb, struct net_device *dev)
{
struct pcpu_lstats *pcpu_lstats, *lb_stats;
+ int len;
skb_orphan(skb);
- skb->protocol = eth_type_trans(skb,dev);
+ skb->protocol = eth_type_trans(skb, dev);
/* it's OK to use per_cpu_ptr() because BHs are off */
pcpu_lstats = dev->ml_priv;
lb_stats = per_cpu_ptr(pcpu_lstats, smp_processor_id());
- lb_stats->bytes += skb->len;
- lb_stats->packets++;
- netif_rx(skb);
+ len = skb->len;
+ if (likely(netif_rx(skb) == NET_RX_SUCCESS)) {
+ lb_stats->bytes += len;
+ lb_stats->packets++;
+ } else
+ lb_stats->drops++;
return 0;
}
@@ -93,6 +98,7 @@ static struct net_device_stats *loopback_get_stats(struct net_device *dev)
struct net_device_stats *stats = &dev->stats;
unsigned long bytes = 0;
unsigned long packets = 0;
+ unsigned long drops = 0;
int i;
pcpu_lstats = dev->ml_priv;
@@ -102,11 +108,14 @@ static struct net_device_stats *loopback_get_stats(struct net_device *dev)
lb_stats = per_cpu_ptr(pcpu_lstats, i);
bytes += lb_stats->bytes;
packets += lb_stats->packets;
+ drops += lb_stats->drops;
}
stats->rx_packets = packets;
stats->tx_packets = packets;
- stats->rx_bytes = bytes;
- stats->tx_bytes = bytes;
+ stats->rx_dropped = drops;
+ stats->rx_errors = drops;
+ stats->rx_bytes = bytes;
+ stats->tx_bytes = bytes;
return stats;
}
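The accounting in the patch above can be exercised outside the kernel with a small model. This is a sketch only: `toy_lstats`, `toy_dev_stats`, `toy_account()`, `toy_get_stats()` and `TOY_NR_CPUS` are hypothetical names made up for illustration, and a plain array stands in for the per-CPU storage. What it mirrors from the patch is that bytes/packets are counted only when the receive succeeded, that a failure bumps a separate drop counter, and that the summed drops are reported as both rx_dropped and rx_errors.

```c
#include <assert.h>

#define TOY_NR_CPUS 4 /* hypothetical CPU count for this sketch */

/* Per-CPU counters, shaped like struct pcpu_lstats after the patch. */
struct toy_lstats {
	unsigned long packets;
	unsigned long bytes;
	unsigned long drops;
};

/* The subset of device stats that the patch fills in. */
struct toy_dev_stats {
	unsigned long rx_packets, tx_packets;
	unsigned long rx_bytes, tx_bytes;
	unsigned long rx_dropped, rx_errors;
};

/* Account one packet of length len; rx_ok models netif_rx() succeeding. */
void toy_account(struct toy_lstats *s, unsigned long len, int rx_ok)
{
	if (rx_ok) {
		s->bytes += len;
		s->packets++;
	} else {
		s->drops++;
	}
}

/* Sum the per-CPU counters into device-wide stats. */
void toy_get_stats(const struct toy_lstats *percpu, int ncpus,
		   struct toy_dev_stats *out)
{
	unsigned long packets = 0, bytes = 0, drops = 0;
	int i;

	assert(ncpus > 0);
	for (i = 0; i < ncpus; i++) {
		packets += percpu[i].packets;
		bytes += percpu[i].bytes;
		drops += percpu[i].drops;
	}
	out->rx_packets = packets;
	out->tx_packets = packets;
	out->rx_bytes = bytes;
	out->tx_bytes = bytes;
	out->rx_dropped = drops; /* drops are reported on the RX side */
	out->rx_errors = drops;
}
```

Because only successful receives reach the packet and byte counters, the TX and RX totals stay equal in this model, and a dropped packet shows up solely in the two RX error fields.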
* Re: [PATCH] loopback: packet drops accounting
2009-04-18 8:03 ` [PATCH] loopback: packet drops accounting Eric Dumazet
@ 2009-04-20 9:26 ` David Miller
0 siblings, 0 replies; 13+ messages in thread
From: David Miller @ 2009-04-20 9:26 UTC (permalink / raw)
To: dada1; +Cc: netdev
From: Eric Dumazet <dada1@cosmosbay.com>
Date: Sat, 18 Apr 2009 10:03:10 +0200
> You were right :)
There is a first time for everything :)
> [PATCH] loopback: packet drops accounting
Applied, thanks Eric!
Thread overview: 13+ messages
2009-04-16 19:58 [PATCH 2.6.30] Network Drop Monitor: Make use of consume_skb() in af_can.c Oliver Hartkopp
2009-04-17 8:38 ` David Miller
2009-04-17 8:56 ` [PATCH] loopback: packet drops accounting Eric Dumazet
2009-04-17 8:59 ` David Miller
2009-04-17 9:27 ` Eric Dumazet
2009-04-17 10:06 ` [PATCH] loopback: better handling of packet drops Eric Dumazet
2009-04-17 10:33 ` Eric Dumazet
2009-04-17 10:51 ` David Miller
2009-04-17 12:22 ` Eric Dumazet
2009-04-17 14:58 ` Stephen Hemminger
2009-04-17 15:05 ` Eric Dumazet
2009-04-18 8:03 ` [PATCH] loopback: packet drops accounting Eric Dumazet
2009-04-20 9:26 ` David Miller