From: Eric Dumazet <dada1@cosmosbay.com>
To: David Miller <davem@davemloft.net>
Cc: netdev@vger.kernel.org
Subject: [PATCH] loopback: better handling of packet drops
Date: Fri, 17 Apr 2009 12:06:28 +0200 [thread overview]
Message-ID: <49E854A4.1010204@cosmosbay.com> (raw)
In-Reply-To: <49E84B99.1080502@cosmosbay.com>
Eric Dumazet a écrit :
> David Miller a écrit :
>> From: Eric Dumazet <dada1@cosmosbay.com>
>> Date: Fri, 17 Apr 2009 10:56:57 +0200
>>
>>> We can in some situations drop packets in netif_rx()
>>>
>>> loopback driver does not report these (unlikely) drops to its stats,
>>> and incorrectly change packets/bytes counts.
>>>
>>> After this patch applied, "ifconfig lo" can reports these drops as in :
>>>
>>> # ifconfig lo
>>> lo Link encap:Local Loopback
>>> inet addr:127.0.0.1 Mask:255.0.0.0
>>> UP LOOPBACK RUNNING MTU:16436 Metric:1
>>> RX packets:692562900 errors:0 dropped:0 overruns:0 frame:0
>>> TX packets:692562900 errors:3228 dropped:3228 overruns:0 carrier:0
>>> collisions:0 txqueuelen:0
>>> RX bytes:2865674174 (2.6 GiB) TX bytes:2865674174 (2.6 GiB)
>>>
>>> I chose to reflect those errors only in tx_dropped/tx_errors, and not mirror
>>> these errors in rx_dropped/rx_errors.
>>>
>>> Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
>> Well, logically the receive is what failed, not the transmit.
>>
>> I think it's therefore misleading to count it as a TX drop.
>>
>> Do you feel strongly about this?
>
> Not at all, but my plan was to go a litle bit further, ie being able to
> return from loopback_xmit() with a non null value.
>
Something like this :
[PATCH] loopback: better handling of packet drops
We can in some situations drop packets in netif_rx()
loopback driver does not report these (unlikely) drops to its stats,
and incorrectly change packets/bytes counts. Also upper layers are
not warned of these transmit failures.
After this patch applied, "ifconfig lo" can reports these drops as in :
# ifconfig lo
lo Link encap:Local Loopback
inet addr:127.0.0.1 Mask:255.0.0.0
UP LOOPBACK RUNNING MTU:16436 Metric:1
RX packets:692562900 errors:0 dropped:0 overruns:0 frame:0
TX packets:692562900 errors:3228 dropped:3228 overruns:0 carrier:0
collisions:0 txqueuelen:0
RX bytes:2865674174 (2.6 GiB) TX bytes:2865674174 (2.6 GiB)
More over, loopback_xmit() can now return to its caller the indication that
packet was not transmitted for better queue management and error handling.
I chose to reflect those errors only in tx_dropped/tx_errors, and not mirror
them in rx_dropped/rx_errors.
Splitting netif_rx() with a helper function boosts tbench performance by 1%,
because we can avoid two tests (about netpoll and timestamping)
Tested with /proc/sys/net/core/netdev_max_backlog set to 0, tbench
can run at full speed even with some 'losses' on loopback. No more
tcp stalls...
Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
---
drivers/net/loopback.c | 24 +++++++++---
include/linux/netdevice.h | 1
net/core/dev.c | 68 +++++++++++++++++++++++-------------
3 files changed, 62 insertions(+), 31 deletions(-)
diff --git a/drivers/net/loopback.c b/drivers/net/loopback.c
index b7d438a..a1308fd 100644
--- a/drivers/net/loopback.c
+++ b/drivers/net/loopback.c
@@ -62,6 +62,7 @@
struct pcpu_lstats {
unsigned long packets;
unsigned long bytes;
+ unsigned long drops;
};
/*
@@ -71,20 +72,25 @@ struct pcpu_lstats {
static int loopback_xmit(struct sk_buff *skb, struct net_device *dev)
{
struct pcpu_lstats *pcpu_lstats, *lb_stats;
+ int len;
skb_orphan(skb);
- skb->protocol = eth_type_trans(skb,dev);
+ skb->protocol = eth_type_trans(skb, dev);
/* it's OK to use per_cpu_ptr() because BHs are off */
pcpu_lstats = dev->ml_priv;
lb_stats = per_cpu_ptr(pcpu_lstats, smp_processor_id());
- lb_stats->bytes += skb->len;
- lb_stats->packets++;
- netif_rx(skb);
+ len = skb->len;
+ if (likely(__netif_rx(skb) == NET_RX_SUCCESS)) {
+ lb_stats->bytes += len;
+ lb_stats->packets++;
+ return 0;
+ }
+ lb_stats->drops++;
- return 0;
+ return 1;
}
static struct net_device_stats *loopback_get_stats(struct net_device *dev)
@@ -93,6 +99,7 @@ static struct net_device_stats *loopback_get_stats(struct net_device *dev)
struct net_device_stats *stats = &dev->stats;
unsigned long bytes = 0;
unsigned long packets = 0;
+ unsigned long drops = 0;
int i;
pcpu_lstats = dev->ml_priv;
@@ -102,11 +109,14 @@ static struct net_device_stats *loopback_get_stats(struct net_device *dev)
lb_stats = per_cpu_ptr(pcpu_lstats, i);
bytes += lb_stats->bytes;
packets += lb_stats->packets;
+ drops += lb_stats->drops;
}
stats->rx_packets = packets;
stats->tx_packets = packets;
- stats->rx_bytes = bytes;
- stats->tx_bytes = bytes;
+ stats->tx_dropped = drops;
+ stats->tx_errors = drops;
+ stats->rx_bytes = bytes;
+ stats->tx_bytes = bytes;
return stats;
}
diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index 2e7783f..c60e250 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -1430,6 +1430,7 @@ extern void dev_kfree_skb_irq(struct sk_buff *skb);
extern void dev_kfree_skb_any(struct sk_buff *skb);
#define HAVE_NETIF_RX 1
+extern int __netif_rx(struct sk_buff *skb);
extern int netif_rx(struct sk_buff *skb);
extern int netif_rx_ni(struct sk_buff *skb);
#define HAVE_NETIF_RECEIVE_SKB 1
diff --git a/net/core/dev.c b/net/core/dev.c
index 343883f..8ae3f19 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -1909,6 +1909,44 @@ int weight_p __read_mostly = 64; /* old backlog weight */
DEFINE_PER_CPU(struct netif_rx_stats, netdev_rx_stat) = { 0, };
+/*
+ * helper function called from netif_rx() or loopback_xmit()
+ */
+int __netif_rx(struct sk_buff *skb)
+{
+ struct softnet_data *queue;
+ unsigned long flags;
+
+ /*
+ * The code is rearranged so that the path is the most
+ * short when CPU is congested, but is still operating.
+ */
+ local_irq_save(flags);
+ queue = &__get_cpu_var(softnet_data);
+
+ __get_cpu_var(netdev_rx_stat).total++;
+ if (queue->input_pkt_queue.qlen <= netdev_max_backlog) {
+ if (queue->input_pkt_queue.qlen) {
+enqueue:
+ __skb_queue_tail(&queue->input_pkt_queue, skb);
+ local_irq_restore(flags);
+ return NET_RX_SUCCESS;
+ }
+
+ napi_schedule(&queue->backlog);
+ goto enqueue;
+ }
+
+ __get_cpu_var(netdev_rx_stat).dropped++;
+ local_irq_restore(flags);
+ /*
+ * Dont free skb here.
+ * netif_rx() will call kfree_skb(skb)
+ * loopback_xmit() will not free it but return an error to its caller
+ */
+ return NET_RX_DROP;
+}
+
/**
* netif_rx - post buffer to the network code
* @skb: buffer to post
@@ -1928,6 +1966,7 @@ int netif_rx(struct sk_buff *skb)
{
struct softnet_data *queue;
unsigned long flags;
+ int ret;
/* if netpoll wants it, pretend we never saw it */
if (netpoll_rx(skb))
@@ -1936,32 +1975,14 @@ int netif_rx(struct sk_buff *skb)
if (!skb->tstamp.tv64)
net_timestamp(skb);
- /*
- * The code is rearranged so that the path is the most
- * short when CPU is congested, but is still operating.
- */
- local_irq_save(flags);
- queue = &__get_cpu_var(softnet_data);
-
- __get_cpu_var(netdev_rx_stat).total++;
- if (queue->input_pkt_queue.qlen <= netdev_max_backlog) {
- if (queue->input_pkt_queue.qlen) {
-enqueue:
- __skb_queue_tail(&queue->input_pkt_queue, skb);
- local_irq_restore(flags);
- return NET_RX_SUCCESS;
- }
-
- napi_schedule(&queue->backlog);
- goto enqueue;
- }
+ ret = __netif_rx(skb);
- __get_cpu_var(netdev_rx_stat).dropped++;
- local_irq_restore(flags);
+ if (unlikely(ret == NET_RX_DROP))
+ kfree_skb(skb);
- kfree_skb(skb);
- return NET_RX_DROP;
+ return ret;
}
+EXPORT_SYMBOL(netif_rx);
int netif_rx_ni(struct sk_buff *skb)
{
@@ -5307,7 +5328,6 @@ EXPORT_SYMBOL(netdev_boot_setup_check);
EXPORT_SYMBOL(netdev_set_master);
EXPORT_SYMBOL(netdev_state_change);
EXPORT_SYMBOL(netif_receive_skb);
-EXPORT_SYMBOL(netif_rx);
EXPORT_SYMBOL(register_gifconf);
EXPORT_SYMBOL(register_netdevice);
EXPORT_SYMBOL(register_netdevice_notifier);
next prev parent reply other threads:[~2009-04-17 10:06 UTC|newest]
Thread overview: 13+ messages / expand[flat|nested] mbox.gz Atom feed top
2009-04-16 19:58 [PATCH 2.6.30] Network Drop Monitor: Make use of consume_skb() in af_can.c Oliver Hartkopp
2009-04-17 8:38 ` David Miller
2009-04-17 8:56 ` [PATCH] loopback: packet drops accounting Eric Dumazet
2009-04-17 8:59 ` David Miller
2009-04-17 9:27 ` Eric Dumazet
2009-04-17 10:06 ` Eric Dumazet [this message]
2009-04-17 10:33 ` [PATCH] loopback: better handling of packet drops Eric Dumazet
2009-04-17 10:51 ` David Miller
2009-04-17 12:22 ` Eric Dumazet
2009-04-17 14:58 ` Stephen Hemminger
2009-04-17 15:05 ` Eric Dumazet
2009-04-18 8:03 ` [PATCH] loopback: packet drops accounting Eric Dumazet
2009-04-20 9:26 ` David Miller
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=49E854A4.1010204@cosmosbay.com \
--to=dada1@cosmosbay.com \
--cc=davem@davemloft.net \
--cc=netdev@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).