* Re: [PATCH] gigaset: include cleanup cleanup
From: Karsten Keil @ 2010-04-18 13:31 UTC (permalink / raw)
To: Tejun Heo
Cc: Tilman Schmidt, David Miller, Hansjoerg Lipp, i4ldeveloper,
netdev, linux-kernel
In-Reply-To: <4BCA6B8E.9000408@kernel.org>
On Sunday, 18 April 2010 04:16:46 Tejun Heo wrote:
> Hello,
>
> On 04/17/2010 07:08 AM, Tilman Schmidt wrote:
> > Commit 5a0e3ad causes slab.h to be included twice in many of the
> > Gigaset driver's source files, first via the common include file
> > gigaset.h and then a second time directly. Drop the spares, and
> > use the opportunity to clean up a few more similar cases.
> >
> > Impact: cleanup, no functional change
> > Signed-off-by: Tilman Schmidt <tilman@imap.cc>
> > CC: Tejun Heo <tj@kernel.org>
>
> Acked-by: Tejun Heo <tj@kernel.org>
Acked-by: Karsten Keil <isdn@linux-pingi.de>
>
> Thanks for the clean up.
>
> > Seeing that the "include cleanup" patch triggering this was accepted
> > after the merge window, I have hopes this one will be accepted, too.
>
> Hmm... which tree should this go through? I can route it
> through percpu but maybe taking the usual isdn patch path would be
> better?
>
I think David Miller will take it.
Karsten
^ permalink raw reply
* Re: [PATCH 6/6] X25: Use identifiers for hdlc x25 device to x25 interface
From: Krzysztof Halasa @ 2010-04-18 12:39 UTC (permalink / raw)
To: Andrew Hendry; +Cc: netdev
In-Reply-To: <1271584429.6280.435.camel@ibex>
Andrew Hendry <andrew.hendry@gmail.com> writes:
> Change magic numbers to identifiers for X25 interface.
Looks good to me, ack.
> Note, x25_connect_disconnect 'reason' appears unimplemented?
To be honest, I don't know this stuff at all. I added it years ago and
got it to the point where some x25 telnet (or something like that)
connected to some server; then I had a successful confirmation from
somebody. Personally I don't even enable X.25 in my tests. I don't know
if anyone is still using this code.
--
Krzysztof Halasa
^ permalink raw reply
* SNMP OutOctets counter semantics
From: Mattias Rönnblom @ 2010-04-18 12:16 UTC (permalink / raw)
To: netdev
Hi,
after having a look at the SNMP counters (exposed in /proc/net/snmp
among other places), I have a question on the meaning of the
"OutOctets" counter. I'm looking at 2.6.33.
According to include/linux/snmp.h IPSTATS_MIB_OUTPKTS is
"OutRequests". The same file refers to
"draft-ietf-ipv6-rfc2011-update-10.txt", which I believe is what
became RFC 4293. According to both of those standard documents,
"OutRequests" counts IP packets coming from upper (sub-)layers (a
transport layer, ICMP etc) into the IP layer.
This corresponds well with how this counter is actually incremented in
the code, as far as I can tell.
In 2.6.31, a corresponding byte counter "OutOctets" was introduced,
which in the Linux kernel counts the bytes of the same packets that
"OutRequests" counts.
In RFC 4293 (and its draft) there is indeed an OutOctets, but this is a
byte counter for packets _leaving_ the IP layer into the link layer.
This counter in the standard corresponds to the "OutTransmits" packet
counter, which is not implemented in the Linux kernel.
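To make the distinction concrete, here is a minimal sketch in C. It is not kernel code: the struct and both hook functions are hypothetical, and only the counter names follow RFC 4293. It counts OutRequests where an upper layer hands a datagram to IP, and OutTransmits/OutOctets where IP hands a datagram (or each fragment of one) to the link layer.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical counter block; only the names mirror RFC 4293. */
struct ip_out_stats {
    uint64_t out_requests;  /* datagrams handed to IP by upper layers */
    uint64_t out_transmits; /* datagrams handed by IP to the link layer */
    uint64_t out_octets;    /* octets of the datagrams in out_transmits */
};

/* An upper layer (transport, ICMP, ...) asks IP to send one datagram. */
void ip_request_from_upper_layer(struct ip_out_stats *s)
{
    s->out_requests++;
}

/* IP passes one datagram or fragment of len octets to the link layer. */
void ip_transmit_to_link_layer(struct ip_out_stats *s, size_t len)
{
    s->out_transmits++;
    s->out_octets += len;
}
```

In this model a single request that is fragmented in two raises out_requests once but out_transmits twice, and out_octets grows by the size of each fragment as it leaves the IP layer.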
My question is: is this simply a bug, or does the kernel draw its
counter semantics from some other standard?
Best regards,
Mattias
^ permalink raw reply
* Re: rps perfomance WAS(Re: rps: question
From: Eric Dumazet @ 2010-04-18 11:34 UTC (permalink / raw)
To: hadi; +Cc: Changli Gao, Rick Jones, David Miller, therbert, netdev, robert,
andi
In-Reply-To: <1271583573.16881.4798.camel@edumazet-laptop>
On Sunday, 18 April 2010 at 11:39 +0200, Eric Dumazet wrote:
> No, only one packet per IPI: since I set my tg3 coalescing parameters
> to the minimum value, I receive one packet per interrupt.
>
> The specific app is :
>
> for f in `seq 1 8`; do while :; do :; done& done
>
Another interesting userland app would be a CPU _and_ memory
cruncher, because of the cache misses we'll get.
$ cat nloop.c
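#include <stdlib.h>
#include <string.h>
#include <unistd.h>

#define SZ (4*1024*1024)	/* parenthesized so the macro expands safely */

int main(int argc, char *argv[])
{
	int nproc = 8;
	char *buffer;

	if (argc > 1)
		nproc = atoi(argv[1]);
	/* fork nproc - 1 children; each child leaves the loop at once */
	while (nproc > 1) {
		if (fork() == 0)
			break;
		nproc--;
	}
	buffer = malloc(SZ);
	if (!buffer)
		return 1;
	/* keep the CPU and the memory bus busy forever */
	while (1)
		memset(buffer, 0x55, SZ);
}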
$ ./nloop 8 &
echo 00 >/sys/class/net/eth3/queues/rx-0/rps_cpus
4861ms
echo 01 >/sys/class/net/eth3/queues/rx-0/rps_cpus
4981ms
echo 02 >/sys/class/net/eth3/queues/rx-0/rps_cpus
7191ms
echo 04 >/sys/class/net/eth3/queues/rx-0/rps_cpus
7128ms
echo 08 >/sys/class/net/eth3/queues/rx-0/rps_cpus
7107ms
echo 10 >/sys/class/net/eth3/queues/rx-0/rps_cpus
5505ms
echo 20 >/sys/class/net/eth3/queues/rx-0/rps_cpus
7125ms
echo 40 >/sys/class/net/eth3/queues/rx-0/rps_cpus
7022ms
echo 80 >/sys/class/net/eth3/queues/rx-0/rps_cpus
7157ms
Maximum overhead is (7191 - 4861) ms / 100000 packets = 23.3 us per packet
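The per-packet figure can be reproduced with a small helper. This is only a sketch of the arithmetic: the function name is made up, and the packet count of 100000 is assumed to be the same "ping -f -q -c 100000" run used in the earlier tests.

```c
/* Extra cost per packet in microseconds, given two run times in ms. */
double per_packet_overhead_us(double baseline_ms, double measured_ms,
                              unsigned long packets)
{
    /* ms -> us, then spread the difference over all packets */
    return (measured_ms - baseline_ms) * 1000.0 / (double)packets;
}
```

Calling it with 4861, 7191 and 100000 gives the worst-case overhead quoted above; the same helper applies to any other pair of timings in the table.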
^ permalink raw reply
* Re: [PATCH 1/6] X25: Use identifiers for X25 to device interface
From: John Hughes @ 2010-04-18 11:08 UTC (permalink / raw)
To: John Hughes; +Cc: Andrew Hendry, netdev
In-Reply-To: <4BCAE509.9000209@Calva.COM>
John Hughes wrote:
> Shouldn't you use explicit values here?
>
> enum {
> X25_IFACE_DATA = 0x00, /* explicit value for ABI stability */
> ...
Oh, and is net/x25device.h suitable for inclusion from user space (xotd
for example)?
^ permalink raw reply
* Re: [PATCH 1/6] X25: Use identifiers for X25 to device interface
From: John Hughes @ 2010-04-18 10:55 UTC (permalink / raw)
To: Andrew Hendry; +Cc: netdev
In-Reply-To: <1271584310.6280.425.camel@ibex>
Andrew Hendry wrote:
> Use identifiers in x25device.h instead of magic numbers for X25 layer 3 to device interface.
> Also fixed checkpatch notes on updated code.
> [...]
>
> -First Byte = 0x00
> +First Byte = 0x00 (X25_IFACE_DATA)
> [...]
> +
> +enum {
> + X25_IFACE_DATA,
> + X25_IFACE_CONNECT,
> + X25_IFACE_DISCONNECT,
> + X25_IFACE_PARAMS
> +};
>
Shouldn't you use explicit values here?
enum {
X25_IFACE_DATA = 0x00, /* explicit value for ABI stability */
...
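Spelled out, the suggestion could look like the sketch below. The values are the ones listed in Documentation/networking/x25-iface.txt (0x00 to 0x03); the compile-time check is an addition of this sketch and assumes a C11 compiler.

```c
/* First byte of frames exchanged between the X.25 packet layer and a
 * device driver; explicit values keep the interface stable even if
 * enumerators are later reordered or inserted. */
enum {
    X25_IFACE_DATA       = 0x00,
    X25_IFACE_CONNECT    = 0x01,
    X25_IFACE_DISCONNECT = 0x02,
    X25_IFACE_PARAMS     = 0x03
};

/* C11 compile-time guard against an accidental renumbering. */
_Static_assert(X25_IFACE_PARAMS == 0x03, "X25_IFACE_PARAMS must stay 0x03");
```

With implicit numbering the values happen to be identical today, but inserting a new enumerator in the middle would silently shift the later ones.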
^ permalink raw reply
* Re: [PATCH v5] rfs: Receive Flow Steering
From: Franco Fichtner @ 2010-04-18 11:06 UTC (permalink / raw)
To: Changli Gao; +Cc: Tom Herbert, Eric Dumazet, David Miller, netdev
In-Reply-To: <h2q412e6f7f1004171706w905cc5c8u51f82cad221642f4@mail.gmail.com>
Changli Gao wrote:
> On Sun, Apr 18, 2010 at 1:38 AM, Tom Herbert <therbert@google.com> wrote:
>
>> That's cool! But I still like the idea that this hash is treated as
>> an opaque value; getting the hash from the device to avoid the jhash
>> or cache misses on the packet can also be a win... Maybe connection
>> tracking/firewall could use the skb->rxhash, which provides the
>> consistency and also eliminates the need to do more jhashes.
>>
>>
>
> A consistent rxhash only adds the risk of hash collisions, and I
> don't think that is a big problem. For connection tracking/firewall use,
> I am afraid that we have to recompute this value after defrag. So we
> have to export the hash function we used in RPS.
>
> As the NIC's hash function can be changed dynamically, the rxhash isn't
> consistent, so the rxhash can't be used by connection tracking, socket
> lookup and others that come later.
>
>
I have to agree with Eric and Changli here.
It's especially true if you're passively tracking via one NIC, where all
traffic is just forwarded. In this scenario, you need to compute
consistent hashes. rxhashes by NIC will be different for "incoming" and
"outgoing" traffic...
Where rxhash by NIC can be used (note: didn't say _useful_) are
scenarios with different net ports for incoming and outgoing traffic (in
active but also passive traffic scenarios). Here, rxhashes could be used
on a per-port basis, but associating two seemingly separate rxhashes
with one another to match CPUs is a really annoying task. This would
involve computing the corresponding "txhash" and looking it up, which is
what we'd be doing with the jhash anyway.
For proper flow tracking Eric's suggestion is the way to go. And if
there are worries about collisions, why not add IPPROTO_* to the mix.
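As an illustration of what a "consistent" hash means here, the following sketch computes the same value for both directions of a flow and folds the IP protocol number in. It is neither Eric's patch nor kernel code: the function names are hypothetical, and the mixer is a Murmur3-style finalizer standing in for jhash.

```c
#include <stdint.h>

/* Stand-in 32-bit mixer (Murmur3-style finalizer), NOT the kernel's jhash. */
static uint32_t mix32(uint32_t h)
{
    h ^= h >> 16;
    h *= 0x85ebca6bu;
    h ^= h >> 13;
    h *= 0xc2b2ae35u;
    h ^= h >> 16;
    return h;
}

/* Direction-independent flow hash over addresses, ports and protocol. */
uint32_t flow_hash_symmetric(uint32_t saddr, uint32_t daddr,
                             uint16_t sport, uint16_t dport, uint8_t proto)
{
    uint32_t lo_addr = saddr, hi_addr = daddr;
    uint16_t lo_port = sport, hi_port = dport;
    uint32_t h;

    /* Order the two endpoints so that A->B and B->A look identical. */
    if (saddr > daddr || (saddr == daddr && sport > dport)) {
        lo_addr = daddr; hi_addr = saddr;
        lo_port = dport; hi_port = sport;
    }

    h = mix32(lo_addr);
    h = mix32(h ^ hi_addr);
    h = mix32(h ^ (((uint32_t)lo_port << 16) | hi_port));
    h = mix32(h ^ proto); /* IPPROTO_* added to the mix */
    return h;
}
```

Because the endpoints are sorted before hashing, a passive tracker sees one hash per flow regardless of which direction a packet travels, which is what a per-NIC rxhash does not guarantee.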
Franco
^ permalink raw reply
* [PATCH 6/6] X25: Use identifiers for hdlc x25 device to x25 interface
From: Andrew Hendry @ 2010-04-18 9:53 UTC (permalink / raw)
To: netdev, khc
Change magic numbers to identifiers for X25 interface.
Note, x25_connect_disconnect 'reason' appears unimplemented?
Have left that part as is for the moment.
Signed-off-by: Andrew Hendry <andrew.hendry@gmail.com>
---
drivers/net/wan/hdlc_x25.c | 12 ++++++------
1 files changed, 6 insertions(+), 6 deletions(-)
diff --git a/drivers/net/wan/hdlc_x25.c b/drivers/net/wan/hdlc_x25.c
index c7adbb7..70527e5 100644
--- a/drivers/net/wan/hdlc_x25.c
+++ b/drivers/net/wan/hdlc_x25.c
@@ -49,14 +49,14 @@ static void x25_connect_disconnect(struct net_device *dev, int reason, int code)
static void x25_connected(struct net_device *dev, int reason)
{
- x25_connect_disconnect(dev, reason, 1);
+ x25_connect_disconnect(dev, reason, X25_IFACE_CONNECT);
}
static void x25_disconnected(struct net_device *dev, int reason)
{
- x25_connect_disconnect(dev, reason, 2);
+ x25_connect_disconnect(dev, reason, X25_IFACE_DISCONNECT);
}
@@ -71,7 +71,7 @@ static int x25_data_indication(struct net_device *dev, struct sk_buff *skb)
return NET_RX_DROP;
ptr = skb->data;
- *ptr = 0;
+ *ptr = X25_IFACE_DATA;
skb->protocol = x25_type_trans(skb, dev);
return netif_rx(skb);
@@ -94,13 +94,13 @@ static netdev_tx_t x25_xmit(struct sk_buff *skb, struct net_device *dev)
/* X.25 to LAPB */
switch (skb->data[0]) {
- case 0: /* Data to be transmitted */
+ case X25_IFACE_DATA: /* Data to be transmitted */
skb_pull(skb, 1);
if ((result = lapb_data_request(dev, skb)) != LAPB_OK)
dev_kfree_skb(skb);
return NETDEV_TX_OK;
- case 1:
+ case X25_IFACE_CONNECT:
if ((result = lapb_connect_request(dev))!= LAPB_OK) {
if (result == LAPB_CONNECTED)
/* Send connect confirm. msg to level 3 */
@@ -112,7 +112,7 @@ static netdev_tx_t x25_xmit(struct sk_buff *skb, struct net_device *dev)
}
break;
- case 2:
+ case X25_IFACE_DISCONNECT:
if ((result = lapb_disconnect_request(dev)) != LAPB_OK) {
if (result == LAPB_NOTCONNECTED)
/* Send disconnect confirm. msg to level 3 */
--
1.5.6.5
^ permalink raw reply related
* [PATCH 5/6] X25: Use identifiers for cyclades device to x25 interface
From: Andrew Hendry @ 2010-04-18 9:53 UTC (permalink / raw)
To: netdev, acme
Change magic numbers to identifiers for X25 interface.
Signed-off-by: Andrew Hendry <andrew.hendry@gmail.com>
---
drivers/net/wan/cycx_x25.c | 13 ++++++++-----
1 files changed, 8 insertions(+), 5 deletions(-)
diff --git a/drivers/net/wan/cycx_x25.c b/drivers/net/wan/cycx_x25.c
index cd8cb95..cf9e15f 100644
--- a/drivers/net/wan/cycx_x25.c
+++ b/drivers/net/wan/cycx_x25.c
@@ -634,11 +634,12 @@ static netdev_tx_t cycx_netdevice_hard_start_xmit(struct sk_buff *skb,
}
} else { /* chan->protocol == ETH_P_X25 */
switch (skb->data[0]) {
- case 0: break;
- case 1: /* Connect request */
+ case X25_IFACE_DATA:
+ break;
+ case X25_IFACE_CONNECT:
cycx_x25_chan_connect(dev);
goto free_packet;
- case 2: /* Disconnect request */
+ case X25_IFACE_DISCONNECT:
cycx_x25_chan_disconnect(dev);
goto free_packet;
default:
@@ -1406,7 +1407,8 @@ static void cycx_x25_set_chan_state(struct net_device *dev, u8 state)
reset_timer(dev);
if (chan->protocol == ETH_P_X25)
- cycx_x25_chan_send_event(dev, 1);
+ cycx_x25_chan_send_event(dev,
+ X25_IFACE_CONNECT);
break;
case WAN_CONNECTING:
@@ -1424,7 +1426,8 @@ static void cycx_x25_set_chan_state(struct net_device *dev, u8 state)
}
if (chan->protocol == ETH_P_X25)
- cycx_x25_chan_send_event(dev, 2);
+ cycx_x25_chan_send_event(dev,
+ X25_IFACE_DISCONNECT);
netif_wake_queue(dev);
break;
--
1.5.6.5
^ permalink raw reply related
* [PATCH 4/6] X25: Use identifiers for lapbether device to x25 interface
From: Andrew Hendry @ 2010-04-18 9:53 UTC (permalink / raw)
To: netdev
Change magic numbers to identifiers for X25 interface.
Signed-off-by: Andrew Hendry <andrew.hendry@gmail.com>
---
drivers/net/wan/lapbether.c | 12 ++++++------
1 files changed, 6 insertions(+), 6 deletions(-)
diff --git a/drivers/net/wan/lapbether.c b/drivers/net/wan/lapbether.c
index 98e2f99..4d4dc38 100644
--- a/drivers/net/wan/lapbether.c
+++ b/drivers/net/wan/lapbether.c
@@ -139,7 +139,7 @@ static int lapbeth_data_indication(struct net_device *dev, struct sk_buff *skb)
return NET_RX_DROP;
ptr = skb->data;
- *ptr = 0x00;
+ *ptr = X25_IFACE_DATA;
skb->protocol = x25_type_trans(skb, dev);
return netif_rx(skb);
@@ -161,14 +161,14 @@ static netdev_tx_t lapbeth_xmit(struct sk_buff *skb,
goto drop;
switch (skb->data[0]) {
- case 0x00:
+ case X25_IFACE_DATA:
break;
- case 0x01:
+ case X25_IFACE_CONNECT:
if ((err = lapb_connect_request(dev)) != LAPB_OK)
printk(KERN_ERR "lapbeth: lapb_connect_request "
"error: %d\n", err);
goto drop;
- case 0x02:
+ case X25_IFACE_DISCONNECT:
if ((err = lapb_disconnect_request(dev)) != LAPB_OK)
printk(KERN_ERR "lapbeth: lapb_disconnect_request "
"err: %d\n", err);
@@ -225,7 +225,7 @@ static void lapbeth_connected(struct net_device *dev, int reason)
}
ptr = skb_put(skb, 1);
- *ptr = 0x01;
+ *ptr = X25_IFACE_CONNECT;
skb->protocol = x25_type_trans(skb, dev);
netif_rx(skb);
@@ -242,7 +242,7 @@ static void lapbeth_disconnected(struct net_device *dev, int reason)
}
ptr = skb_put(skb, 1);
- *ptr = 0x02;
+ *ptr = X25_IFACE_DISCONNECT;
skb->protocol = x25_type_trans(skb, dev);
netif_rx(skb);
--
1.5.6.5
^ permalink raw reply related
* [PATCH 3/6] X25: Use identifiers for x25 async device to x25 interface
From: Andrew Hendry @ 2010-04-18 9:53 UTC (permalink / raw)
To: netdev
Change magic numbers to identifiers for X25 interface.
Signed-off-by: Andrew Hendry <andrew.hendry@gmail.com>
---
drivers/net/wan/x25_asy.c | 12 ++++++------
1 files changed, 6 insertions(+), 6 deletions(-)
diff --git a/drivers/net/wan/x25_asy.c b/drivers/net/wan/x25_asy.c
index 80d5c58..166e77d 100644
--- a/drivers/net/wan/x25_asy.c
+++ b/drivers/net/wan/x25_asy.c
@@ -29,12 +29,12 @@
#include <linux/etherdevice.h>
#include <linux/skbuff.h>
#include <linux/if_arp.h>
-#include <linux/x25.h>
#include <linux/lapb.h>
#include <linux/init.h>
#include <linux/rtnetlink.h>
#include <linux/compat.h>
#include <linux/slab.h>
+#include <net/x25device.h>
#include "x25_asy.h"
#include <net/x25device.h>
@@ -315,15 +315,15 @@ static netdev_tx_t x25_asy_xmit(struct sk_buff *skb,
}
switch (skb->data[0]) {
- case 0x00:
+ case X25_IFACE_DATA:
break;
- case 0x01: /* Connection request .. do nothing */
+ case X25_IFACE_CONNECT: /* Connection request .. do nothing */
err = lapb_connect_request(dev);
if (err != LAPB_OK)
printk(KERN_ERR "x25_asy: lapb_connect_request error - %d\n", err);
kfree_skb(skb);
return NETDEV_TX_OK;
- case 0x02: /* Disconnect request .. do nothing - hang up ?? */
+ case X25_IFACE_DISCONNECT: /* do nothing - hang up ?? */
err = lapb_disconnect_request(dev);
if (err != LAPB_OK)
printk(KERN_ERR "x25_asy: lapb_disconnect_request error - %d\n", err);
@@ -411,7 +411,7 @@ static void x25_asy_connected(struct net_device *dev, int reason)
}
ptr = skb_put(skb, 1);
- *ptr = 0x01;
+ *ptr = X25_IFACE_CONNECT;
skb->protocol = x25_type_trans(skb, sl->dev);
netif_rx(skb);
@@ -430,7 +430,7 @@ static void x25_asy_disconnected(struct net_device *dev, int reason)
}
ptr = skb_put(skb, 1);
- *ptr = 0x02;
+ *ptr = X25_IFACE_DISCONNECT;
skb->protocol = x25_type_trans(skb, sl->dev);
netif_rx(skb);
--
1.5.6.5
^ permalink raw reply related
* [PATCH 2/6] X25: Use identifiers for isdn device to x25 interface
From: Andrew Hendry @ 2010-04-18 9:52 UTC (permalink / raw)
To: netdev, isdn
Change magic numbers to identifiers for X25 interface.
Also minor checkpatch formatting fixes.
Signed-off-by: Andrew Hendry <andrew.hendry@gmail.com>
---
drivers/isdn/i4l/isdn_x25iface.c | 17 +++++++++--------
1 files changed, 9 insertions(+), 8 deletions(-)
diff --git a/drivers/isdn/i4l/isdn_x25iface.c b/drivers/isdn/i4l/isdn_x25iface.c
index efcf1f9..fd10d7c 100644
--- a/drivers/isdn/i4l/isdn_x25iface.c
+++ b/drivers/isdn/i4l/isdn_x25iface.c
@@ -194,7 +194,7 @@ static int isdn_x25iface_receive(struct concap_proto *cprot, struct sk_buff *skb
if ( ( (ix25_pdata_t*) (cprot->proto_data) )
-> state == WAN_CONNECTED ){
if( skb_push(skb, 1)){
- skb -> data[0]=0x00;
+ skb->data[0] = X25_IFACE_DATA;
skb->protocol = x25_type_trans(skb, cprot->net_dev);
netif_rx(skb);
return 0;
@@ -224,7 +224,7 @@ static int isdn_x25iface_connect_ind(struct concap_proto *cprot)
skb = dev_alloc_skb(1);
if( skb ){
- *( skb_put(skb, 1) ) = 0x01;
+ *(skb_put(skb, 1)) = X25_IFACE_CONNECT;
skb->protocol = x25_type_trans(skb, cprot->net_dev);
netif_rx(skb);
return 0;
@@ -253,7 +253,7 @@ static int isdn_x25iface_disconn_ind(struct concap_proto *cprot)
*state_p = WAN_DISCONNECTED;
skb = dev_alloc_skb(1);
if( skb ){
- *( skb_put(skb, 1) ) = 0x02;
+ *(skb_put(skb, 1)) = X25_IFACE_DISCONNECT;
skb->protocol = x25_type_trans(skb, cprot->net_dev);
netif_rx(skb);
return 0;
@@ -272,9 +272,10 @@ static int isdn_x25iface_xmit(struct concap_proto *cprot, struct sk_buff *skb)
unsigned char firstbyte = skb->data[0];
enum wan_states *state = &((ix25_pdata_t*)cprot->proto_data)->state;
int ret = 0;
- IX25DEBUG( "isdn_x25iface_xmit: %s first=%x state=%d \n", MY_DEVNAME(cprot -> net_dev), firstbyte, *state );
+ IX25DEBUG("isdn_x25iface_xmit: %s first=%x state=%d\n",
+ MY_DEVNAME(cprot->net_dev), firstbyte, *state);
switch ( firstbyte ){
- case 0x00: /* dl_data request */
+ case X25_IFACE_DATA:
if( *state == WAN_CONNECTED ){
skb_pull(skb, 1);
cprot -> net_dev -> trans_start = jiffies;
@@ -285,7 +286,7 @@ static int isdn_x25iface_xmit(struct concap_proto *cprot, struct sk_buff *skb)
}
illegal_state_warn( *state, firstbyte );
break;
- case 0x01: /* dl_connect request */
+ case X25_IFACE_CONNECT:
if( *state == WAN_DISCONNECTED ){
*state = WAN_CONNECTING;
ret = cprot -> dops -> connect_req(cprot);
@@ -298,7 +299,7 @@ static int isdn_x25iface_xmit(struct concap_proto *cprot, struct sk_buff *skb)
illegal_state_warn( *state, firstbyte );
}
break;
- case 0x02: /* dl_disconnect request */
+ case X25_IFACE_DISCONNECT:
switch ( *state ){
case WAN_DISCONNECTED:
/* Should not happen. However, give upper layer a
@@ -318,7 +319,7 @@ static int isdn_x25iface_xmit(struct concap_proto *cprot, struct sk_buff *skb)
illegal_state_warn( *state, firstbyte );
}
break;
- case 0x03: /* changing lapb parameters requested */
+ case X25_IFACE_PARAMS:
printk(KERN_WARNING "isdn_x25iface_xmit: setting of lapb"
" options not yet supported\n");
break;
--
1.5.6.5
^ permalink raw reply related
* [PATCH 1/6] X25: Use identifiers for X25 to device interface
From: Andrew Hendry @ 2010-04-18 9:51 UTC (permalink / raw)
To: netdev
Use identifiers in x25device.h instead of magic numbers for X25 layer 3 to device interface.
Also fixed checkpatch notes on updated code.
Signed-off-by: Andrew Hendry <andrew.hendry@gmail.com>
---
Documentation/networking/x25-iface.txt | 16 +++++++-------
include/net/x25device.h | 8 +++++++
net/x25/x25_dev.c | 36 +++++++++++++++++--------------
3 files changed, 36 insertions(+), 24 deletions(-)
diff --git a/Documentation/networking/x25-iface.txt b/Documentation/networking/x25-iface.txt
index 975cc87..78f662e 100644
--- a/Documentation/networking/x25-iface.txt
+++ b/Documentation/networking/x25-iface.txt
@@ -20,23 +20,23 @@ the rest of the skbuff, if any more information does exist.
Packet Layer to Device Driver
-----------------------------
-First Byte = 0x00
+First Byte = 0x00 (X25_IFACE_DATA)
This indicates that the rest of the skbuff contains data to be transmitted
over the LAPB link. The LAPB link should already exist before any data is
passed down.
-First Byte = 0x01
+First Byte = 0x01 (X25_IFACE_CONNECT)
Establish the LAPB link. If the link is already established then the connect
confirmation message should be returned as soon as possible.
-First Byte = 0x02
+First Byte = 0x02 (X25_IFACE_DISCONNECT)
Terminate the LAPB link. If it is already disconnected then the disconnect
confirmation message should be returned as soon as possible.
-First Byte = 0x03
+First Byte = 0x03 (X25_IFACE_PARAMS)
LAPB parameters. To be defined.
@@ -44,22 +44,22 @@ LAPB parameters. To be defined.
Device Driver to Packet Layer
-----------------------------
-First Byte = 0x00
+First Byte = 0x00 (X25_IFACE_DATA)
This indicates that the rest of the skbuff contains data that has been
received over the LAPB link.
-First Byte = 0x01
+First Byte = 0x01 (X25_IFACE_CONNECT)
LAPB link has been established. The same message is used for both a LAPB
link connect_confirmation and a connect_indication.
-First Byte = 0x02
+First Byte = 0x02 (X25_IFACE_DISCONNECT)
LAPB link has been terminated. This same message is used for both a LAPB
link disconnect_confirmation and a disconnect_indication.
-First Byte = 0x03
+First Byte = 0x03 (X25_IFACE_PARAMS)
LAPB parameters. To be defined.
diff --git a/include/net/x25device.h b/include/net/x25device.h
index 1415bcf..51f8902 100644
--- a/include/net/x25device.h
+++ b/include/net/x25device.h
@@ -13,4 +13,12 @@ static inline __be16 x25_type_trans(struct sk_buff *skb, struct net_device *dev)
return htons(ETH_P_X25);
}
+
+enum {
+ X25_IFACE_DATA,
+ X25_IFACE_CONNECT,
+ X25_IFACE_DISCONNECT,
+ X25_IFACE_PARAMS
+};
+
#endif
diff --git a/net/x25/x25_dev.c b/net/x25/x25_dev.c
index b9ef682..9005f6d 100644
--- a/net/x25/x25_dev.c
+++ b/net/x25/x25_dev.c
@@ -24,6 +24,7 @@
#include <net/sock.h>
#include <linux/if_arp.h>
#include <net/x25.h>
+#include <net/x25device.h>
static int x25_receive_data(struct sk_buff *skb, struct x25_neigh *nb)
{
@@ -115,19 +116,22 @@ int x25_lapb_receive_frame(struct sk_buff *skb, struct net_device *dev,
}
switch (skb->data[0]) {
- case 0x00:
- skb_pull(skb, 1);
- if (x25_receive_data(skb, nb)) {
- x25_neigh_put(nb);
- goto out;
- }
- break;
- case 0x01:
- x25_link_established(nb);
- break;
- case 0x02:
- x25_link_terminated(nb);
- break;
+
+ case X25_IFACE_DATA:
+ skb_pull(skb, 1);
+ if (x25_receive_data(skb, nb)) {
+ x25_neigh_put(nb);
+ goto out;
+ }
+ break;
+
+ case X25_IFACE_CONNECT:
+ x25_link_established(nb);
+ break;
+
+ case X25_IFACE_DISCONNECT:
+ x25_link_terminated(nb);
+ break;
}
x25_neigh_put(nb);
drop:
@@ -148,7 +152,7 @@ void x25_establish_link(struct x25_neigh *nb)
return;
}
ptr = skb_put(skb, 1);
- *ptr = 0x01;
+ *ptr = X25_IFACE_CONNECT;
break;
#if defined(CONFIG_LLC) || defined(CONFIG_LLC_MODULE)
@@ -184,7 +188,7 @@ void x25_terminate_link(struct x25_neigh *nb)
}
ptr = skb_put(skb, 1);
- *ptr = 0x02;
+ *ptr = X25_IFACE_DISCONNECT;
skb->protocol = htons(ETH_P_X25);
skb->dev = nb->dev;
@@ -200,7 +204,7 @@ void x25_send_frame(struct sk_buff *skb, struct x25_neigh *nb)
switch (nb->dev->type) {
case ARPHRD_X25:
dptr = skb_push(skb, 1);
- *dptr = 0x00;
+ *dptr = X25_IFACE_DATA;
break;
#if defined(CONFIG_LLC) || defined(CONFIG_LLC_MODULE)
--
1.5.6.5
^ permalink raw reply related
* Re: [PATCH net-next-2.6] net: Introduce skb_orphan_try()
From: David Miller @ 2010-04-18 9:46 UTC (permalink / raw)
To: eric.dumazet; +Cc: netdev
In-Reply-To: <1271456302.16881.4559.camel@edumazet-laptop>
From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Sat, 17 Apr 2010 00:18:22 +0200
> On Thursday, 15 April 2010 at 14:33 -0700, David Miller wrote:
>
>> If it's not legal to skb_orphan() here then it would not be legal for
>> the drivers to unconditionally skb_orphan(), which they do.
>>
>> So either your test is unnecessary, or we have a big existing problem
>> :-)
>
> I cooked up the following patch, introducing the skb_orphan_try()
> helper, to document all known exceptions.
Looks good, applied, thanks Eric.
That timestamping issue, I bet there is some simple solution hiding
in the bushes for that one?
Hmmm, something like... there is a less transient piece of state
(perhaps in the socket) and the skb has a pointer to that piece of
state. Then the driver writes into this pointed-to place rather
than into some member of the skb itself.
Then all of these problems with referencing the skb metadata across
being passed to ->ndo_start_xmit() could go away I think.
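A rough sketch of that idea, using made-up types rather than the real sk_buff or socket structures: the packet carries only a pointer to longer-lived state, and the driver stores the timestamp there. Lifetime management (reference counting of the state) is deliberately left out and would be the hard part in practice.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical longer-lived state, e.g. owned by the socket. */
struct tx_tstamp_state {
    uint64_t hw_tstamp_ns;
    int valid;
};

/* Hypothetical stand-in for the relevant part of an skb. */
struct pkt {
    struct tx_tstamp_state *tstamp; /* NULL if no timestamp was requested */
};

/* Driver side: record the timestamp in the pointed-to state,
 * not in a member of the packet itself. */
void driver_record_tx_tstamp(const struct pkt *p, uint64_t ns)
{
    if (p->tstamp == NULL)
        return;
    p->tstamp->hw_tstamp_ns = ns;
    p->tstamp->valid = 1;
}
```

In this model the packet can be orphaned or freed right after transmission, since the result lives in the state object rather than in the packet metadata.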
> I have a possible followup for this patch :
>
> Orphaning skbs earlier could also make dev_kfree_skb_irq() faster.
> Instead of queueing the skb into completion_queue and triggering
> NET_TX_SOFTIRQ, we would directly free an orphaned skb ?
Sounds great. But I don't see dev_kfree_skb_irq() as much of a
performance priority, since all sane modern drivers free skbs from
softirq context (via NAPI).
^ permalink raw reply
* Re: rps perfomance WAS(Re: rps: question
From: Eric Dumazet @ 2010-04-18 9:39 UTC (permalink / raw)
To: hadi; +Cc: Changli Gao, Rick Jones, David Miller, therbert, netdev, robert,
andi
In-Reply-To: <1271525519.3929.3.camel@bigi>
On Saturday, 17 April 2010 at 13:31 -0400, jamal wrote:
> On Sat, 2010-04-17 at 09:35 +0200, Eric Dumazet wrote:
>
> > I did some tests on a dual quad core machine (E5450 @ 3.00GHz), not
> > nehalem. So a 3-4 years old design.
>
> Eric, I thank you kind sir for going out of your way to do this - it is
> certainly a good processor to compare against
>
> > For all tests, I use the best time of 3 runs of "ping -f -q -c 100000
> > 192.168.0.2". Yes, ping is not very good, but it's available ;)
>
> It is a reasonable quick test, no fancy setup required ;->
>
> > Note: I make sure all 8 cpus of target are busy, eating cpu cycles in
> > user land.
>
> I didn't keep the cpus busy. I should re-run with such a setup; any
> specific app that you used to keep them busy? Keeping them busy could
> have consequences; I am speculating you probably ended up with a greater
> than one packet/IPI ratio, i.e. an amortization benefit..
No, only one packet per IPI: since I set my tg3 coalescing parameters
to the minimum value, I receive one packet per interrupt.
The specific app is :
for f in `seq 1 8`; do while :; do :; done& done
>
> > I dont want to tweak acpi or whatever smart power saving
> > mechanisms.
>
> I should mention i turned off acpi as well in the bios; it was consuming
> more cpu cycles than net-processing and was interfering in my tests.
>
> > When RPS off
> > 100000 packets transmitted, 100000 received, 0% packet loss, time 4160ms
> >
> > RPS on, but directed on the cpu0 handling device interrupts (tg3, napi)
> > (echo 01 > /sys/class/net/eth3/queues/rx-0/rps_cpus)
> > 100000 packets transmitted, 100000 received, 0% packet loss, time 4234ms
> >
> > So the cost of queueing the packet into our own queue (netif_receive_skb
> > -> enqueue_to_backlog) is about 0.74 us (74 ms / 100000)
> >
>
> Excellent analysis.
>
> > I personally think we should process the packet instead of queueing it,
> > but Tom disagrees with me.
>
> Sorry - I am gonna have to turn on some pedagogy and offer my
> Canadian 2 cents;->
> I would lean on agreeing with Tom, but maybe go one step further (sans
> packet-reordering): we should never process packets to socket layer on
> the demuxing cpu.
> enqueue everything you receive on a different cpu - so somehow receiving
> cpu becomes part of a hashing decision ...
>
> The reason is derived from queueing theory - of which i know dangerously
> little - but refer you to mr. little his-self[1] (pun fully
> intended;->):
> i.e fixed serving time provides more predictable results as opposed to
> once in a while a spike as you receive packets destined to "our cpu".
> Queueing packets and later allocating cycles to processing them adds to
> variability, but is not as bad as processing to completion to socket
> layer.
>
> > RPS on, directed on cpu1 (other socket)
> > (echo 02 > /sys/class/net/eth3/queues/rx-0/rps_cpus)
> > 100000 packets transmitted, 100000 received, 0% packet loss, time 4542ms
>
> Good test - should be worst case scenario. But there are two other
> scenarios which will give different results in my opinion.
> On your setup i think each socket has two dies, each with two cores. So
> my feeling is you will get different numbers if you go within same die
> and across dies within same socket. If i am not mistaken, the mapping
> would be something like socket0/die0{core0/2}, socket0/die1{core4/6},
> socket1/die0{core1/3}, socket1/die1{core5/7}.
> If you have cycles can you try the same socket+die but different cores
> and same socket but different die test?
Sure, let's redo a full test, taking the lowest time of three ping runs:
echo 00 >/sys/class/net/eth3/queues/rx-0/rps_cpus
100000 packets transmitted, 100000 received, 0% packet loss, time 4151ms
echo 01 >/sys/class/net/eth3/queues/rx-0/rps_cpus
100000 packets transmitted, 100000 received, 0% packet loss, time 4254ms
echo 02 >/sys/class/net/eth3/queues/rx-0/rps_cpus
100000 packets transmitted, 100000 received, 0% packet loss, time 4563ms
echo 04 >/sys/class/net/eth3/queues/rx-0/rps_cpus
100000 packets transmitted, 100000 received, 0% packet loss, time 4458ms
echo 08 >/sys/class/net/eth3/queues/rx-0/rps_cpus
100000 packets transmitted, 100000 received, 0% packet loss, time 4563ms
echo 10 >/sys/class/net/eth3/queues/rx-0/rps_cpus
100000 packets transmitted, 100000 received, 0% packet loss, time 4327ms
echo 20 >/sys/class/net/eth3/queues/rx-0/rps_cpus
100000 packets transmitted, 100000 received, 0% packet loss, time 4571ms
echo 40 >/sys/class/net/eth3/queues/rx-0/rps_cpus
100000 packets transmitted, 100000 received, 0% packet loss, time 4472ms
echo 80 >/sys/class/net/eth3/queues/rx-0/rps_cpus
100000 packets transmitted, 100000 received, 0% packet loss, time 4568ms
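Each value written to rps_cpus above is a hexadecimal CPU bitmask with exactly one bit set (00 meaning no CPU selected). A minimal sketch of how such a single-CPU mask can be produced follows; the helper name is made up, and it only covers masks that fit in one unsigned long.

```c
#include <limits.h>
#include <stdio.h>

/* Format the hex bitmask that selects a single CPU, e.g. cpu 4 -> "10".
 * Returns snprintf()'s result, or -1 if the CPU number is too large. */
int rps_mask_for_cpu(unsigned int cpu, char *buf, size_t len)
{
    if (cpu >= sizeof(unsigned long) * CHAR_BIT)
        return -1;
    return snprintf(buf, len, "%02lx", 1UL << cpu);
}
```

For CPUs 0 through 7 this yields 01, 02, 04, 08, 10, 20, 40 and 80, the sequence of values used in the test run.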
# egrep "physical id|core|apicid" /proc/cpuinfo
physical id : 0
core id : 0
cpu cores : 4
apicid : 0
initial apicid : 0
physical id : 1
core id : 0
cpu cores : 4
apicid : 4
initial apicid : 4
physical id : 0
core id : 2
cpu cores : 4
apicid : 2
initial apicid : 2
physical id : 1
core id : 2
cpu cores : 4
apicid : 6
initial apicid : 6
physical id : 0
core id : 1
cpu cores : 4
apicid : 1
initial apicid : 1
physical id : 1
core id : 1
cpu cores : 4
apicid : 5
initial apicid : 5
physical id : 0
core id : 3
cpu cores : 4
apicid : 3
initial apicid : 3
physical id : 1
core id : 3
cpu cores : 4
apicid : 7
initial apicid : 7
^ permalink raw reply
* Re: [PATCH net-next-2.6] net: remove time limit in process_backlog()
From: David Miller @ 2010-04-18 9:36 UTC (permalink / raw)
To: eric.dumazet; +Cc: netdev
In-Reply-To: <1271513822.16881.4734.camel@edumazet-laptop>
From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Sat, 17 Apr 2010 16:17:02 +0200
> - There is no point to enforce a time limit in process_backlog(), since
> other napi instances dont follow same rule. We can exit after only one
> packet processed...
> The normal quota of 64 packets per napi instance should be the norm, and
> net_rx_action() already has its own time limit.
> Note : /proc/net/core/dev_weight can be used to tune this 64 default
> value.
>
> - Use DEFINE_PER_CPU_ALIGNED for softnet_data definition.
>
> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Yep, doing this time limit at two levels is pointless.
Applied, thanks Eric!
^ permalink raw reply
* ixgbe - problem with packet/bytes count on all queues
From: Paweł Staszewski @ 2010-04-18 9:14 UTC (permalink / raw)
To: Linux Network Development list
Hello
I want to ask whether this is normal behavior for the ixgbe driver and the 82598EB NIC.
Look at the tx_queue_7 stats:
ethtool -S eth2
NIC statistics:
rx_packets: 35103252
tx_packets: 1770371731
rx_bytes: 3602052416
tx_bytes: 1369778276
rx_pkts_nic: 138121006018
tx_pkts_nic: 122033163226
rx_bytes_nic: 101484528847981
tx_bytes_nic: 92258799092069
lsc_int: 1
tx_busy: 0
non_eop_descs: 0
rx_errors: 0
tx_errors: 0
rx_dropped: 0
tx_dropped: 0
multicast: 490226
broadcast: 124104912
rx_no_buffer_count: 0
collisions: 0
rx_over_errors: 0
rx_crc_errors: 0
rx_frame_errors: 0
hw_rsc_aggregated: 0
hw_rsc_flushed: 0
fdir_match: 0
fdir_miss: 0
rx_fifo_errors: 0
rx_missed_errors: 0
tx_aborted_errors: 0
tx_carrier_errors: 0
tx_fifo_errors: 0
tx_heartbeat_errors: 0
tx_timeout_count: 0
tx_restart_queue: 111130
rx_long_length_errors: 38599
rx_short_length_errors: 0
tx_flow_control_xon: 0
rx_flow_control_xon: 0
tx_flow_control_xoff: 0
rx_flow_control_xoff: 0
rx_csum_offload_errors: 1554191
alloc_rx_page_failed: 0
alloc_rx_buff_failed: 0
rx_no_dma_resources: 0
tx_queue_0_packets: 108685351623
tx_queue_0_bytes: 79701402025544
tx_queue_1_packets: 3988024698
tx_queue_1_bytes: 3353530467775
tx_queue_2_packets: 1893305707
tx_queue_2_bytes: 1705357186034
tx_queue_3_packets: 1787852613
tx_queue_3_bytes: 1518632482370
tx_queue_4_packets: 1843108684
tx_queue_4_bytes: 1641474602504
tx_queue_5_packets: 1882637467
tx_queue_5_bytes: 1629905766993
tx_queue_6_packets: 1952759802
tx_queue_6_bytes: 1680666591771
tx_queue_7_packets: 0
tx_queue_7_bytes: 0
rx_queue_0_packets: 17361735592
rx_queue_0_bytes: 12585728518077
rx_queue_1_packets: 17194262916
rx_queue_1_bytes: 12518731583464
rx_queue_2_packets: 17342312348
rx_queue_2_bytes: 12734959063176
rx_queue_3_packets: 17367632051
rx_queue_3_bytes: 12656219984521
rx_queue_4_packets: 17150307164
rx_queue_4_bytes: 12408526754019
rx_queue_5_packets: 17206721842
rx_queue_5_bytes: 12470666039893
rx_queue_6_packets: 17202210572
rx_queue_6_bytes: 12431429298950
rx_queue_7_packets: 17295822822
rx_queue_7_bytes: 12573299488239
And here, look at multiq queue number 8:
tc -s -d class show dev eth2
class multiq 1:1 parent 1:
Sent 6905560675905 bytes 510743840 pkt (dropped 0, overlimits 0 requeues 0)
backlog 0b 0p requeues 0
class multiq 1:2 parent 1:
Sent 280699743990 bytes 330210442 pkt (dropped 0, overlimits 0 requeues 0)
backlog 0b 0p requeues 0
class multiq 1:3 parent 1:
Sent 128528666971 bytes 142053106 pkt (dropped 0, overlimits 0 requeues 0)
backlog 0b 0p requeues 0
class multiq 1:4 parent 1:
Sent 123086710694 bytes 140454119 pkt (dropped 0, overlimits 0 requeues 0)
backlog 0b 0p requeues 0
class multiq 1:5 parent 1:
Sent 121027779083 bytes 146164066 pkt (dropped 0, overlimits 0 requeues 0)
backlog 0b 0p requeues 0
class multiq 1:6 parent 1:
Sent 116245520195 bytes 141597610 pkt (dropped 0, overlimits 0 requeues 0)
backlog 0b 0p requeues 0
class multiq 1:7 parent 1:
Sent 133310553887 bytes 151141714 pkt (dropped 0, overlimits 0 requeues 0)
backlog 0b 0p requeues 0
class multiq 1:8 parent 1:
Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
backlog 0b 0p requeues 0
Is it normal that the driver doesn't use queue number 8?
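Spotting an unused queue in a long `ethtool -S` dump is easier with a small filter. The following is a minimal sketch in standalone C, assuming the counter lines keep the `tx_queue_<N>_packets: <count>` layout shown above; the function names `parse_tx_queue_packets` and `find_idle_tx_queues` are hypothetical helpers, not part of ethtool or the driver.

```c
#include <stdio.h>
#include <string.h>

/* Parse one line of the form "tx_queue_<N>_packets: <count>".
 * Returns 1 and fills *queue and *packets on a match, 0 otherwise.
 * Lines such as "tx_queue_7_bytes: 0" or "tx_packets: 1" do not match. */
int parse_tx_queue_packets(const char *line, int *queue,
                           unsigned long long *packets)
{
    return sscanf(line, "tx_queue_%d_packets: %llu", queue, packets) == 2;
}

/* Walk a multi-line statistics dump and record the index of every TX
 * queue whose packet counter is zero. At most max indices are stored in
 * idle[]; the return value is the number of entries stored. */
int find_idle_tx_queues(const char *stats, int *idle, int max)
{
    int found = 0;
    const char *p = stats;

    while (*p) {
        int queue;
        unsigned long long packets;

        /* Skip indentation, but never cross into the next line. */
        while (*p == ' ' || *p == '\t')
            p++;

        if (parse_tx_queue_packets(p, &queue, &packets) &&
            packets == 0 && found < max)
            idle[found++] = queue;

        /* Advance to the start of the next line, if there is one. */
        p = strchr(p, '\n');
        if (!p)
            break;
        p++;
    }
    return found;
}
```

Fed the statistics above, `find_idle_tx_queues()` would report only queue 7, which corresponds to multiq class 1:8.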
^ permalink raw reply
* Re: [PATCH] gigaset: include cleanup cleanup
From: David Miller @ 2010-04-18 9:13 UTC (permalink / raw)
To: tj; +Cc: tilman, isdn, hjlipp, i4ldeveloper, netdev, linux-kernel
In-Reply-To: <4BCA6B8E.9000408@kernel.org>
From: Tejun Heo <tj@kernel.org>
Date: Sun, 18 Apr 2010 11:16:46 +0900
>
>> Seeing that the "include cleanup" patch triggering this was accepted
>> after the merge window, I have hopes this one will be accepted, too.
>
> Hmm... through which tree should this go through? I can route it
> through percpu but maybe taking the usual isdn patch path would be
> better?
I'll take it into net-2.6, no worries.
^ permalink raw reply
* Re: [PATCH] TCP: avoid to send keepalive probes if it is receiving data
From: Eric Dumazet @ 2010-04-18 9:06 UTC (permalink / raw)
To: Flavio Leitner; +Cc: netdev
In-Reply-To: <1271525305-28423-1-git-send-email-fleitner@redhat.com>
On Saturday, 17 April 2010 at 14:28 -0300, Flavio Leitner wrote:
> RFC 1122 says the following:
> ...
> Keep-alive packets MUST only be sent when no data or
> acknowledgement packets have been received for the
> connection within an interval.
> ...
>
> Fix this by storing the timestamp of last received data
> packet and checking for it when the keepalive timer expires.
>
> Signed-off-by: Flavio Leitner <fleitner@redhat.com>
Thanks, Flavio!
Shouldn't you also change the TCP_KEEPIDLE handling in do_tcp_setsockopt()
for consistency?
diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
index 0f8caf6..a4048d7 100644
--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c
@@ -2298,7 +2298,10 @@ static int do_tcp_setsockopt(struct sock *sk, int level,
if (sock_flag(sk, SOCK_KEEPOPEN) &&
!((1 << sk->sk_state) &
(TCPF_CLOSE | TCPF_LISTEN))) {
- __u32 elapsed = tcp_time_stamp - tp->rcv_tstamp;
+ u32 elapsed = min_t(u32,
+ tcp_time_stamp - tp->rcv_tstamp,
+ tcp_time_stamp - tp->lrcvtime);
+
if (tp->keepalive_time > elapsed)
elapsed = tp->keepalive_time - elapsed;
else
> ---
> include/linux/tcp.h | 1 +
> net/ipv4/tcp_input.c | 3 +++
> net/ipv4/tcp_timer.c | 8 ++++++++
> 3 files changed, 12 insertions(+), 0 deletions(-)
>
> diff --git a/include/linux/tcp.h b/include/linux/tcp.h
> index a778ee0..405678f 100644
> --- a/include/linux/tcp.h
> +++ b/include/linux/tcp.h
> @@ -314,6 +314,7 @@ struct tcp_sock {
> u32 snd_sml; /* Last byte of the most recently transmitted small packet */
> u32 rcv_tstamp; /* timestamp of last received ACK (for keepalives) */
> u32 lsndtime; /* timestamp of last sent data packet (for restart window) */
> + u32 lrcvtime; /* timestamp of last received data packet (for keepalives) */
>
> /* Data for direct copy to user */
> struct {
> diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
> index f240f57..60d2980 100644
> --- a/net/ipv4/tcp_input.c
> +++ b/net/ipv4/tcp_input.c
> @@ -5391,6 +5391,8 @@ no_ack:
> __kfree_skb(skb);
> else
> sk->sk_data_ready(sk, 0);
> +
> + tp->lrcvtime = tcp_time_stamp;
> return 0;
> }
> }
> @@ -5421,6 +5423,7 @@ step5:
>
> tcp_data_snd_check(sk);
> tcp_ack_snd_check(sk);
> + tp->lrcvtime = tcp_time_stamp;
> return 0;
>
> csum_error:
> diff --git a/net/ipv4/tcp_timer.c b/net/ipv4/tcp_timer.c
> index 8a0ab29..74dd804 100644
> --- a/net/ipv4/tcp_timer.c
> +++ b/net/ipv4/tcp_timer.c
> @@ -554,6 +554,14 @@ static void tcp_keepalive_timer (unsigned long data)
> if (tp->packets_out || tcp_send_head(sk))
> goto resched;
>
> + elapsed = tcp_time_stamp - tp->lrcvtime;
> +
> + /* receiving data means alive */
> + if (elapsed < keepalive_time_when(tp)) {
> + elapsed = keepalive_time_when(tp) - elapsed;
> + goto resched;
> + }
> +
> elapsed = tcp_time_stamp - tp->rcv_tstamp;
>
> if (elapsed >= keepalive_time_when(tp)) {
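The idle-time computation discussed in this message can be tried outside the kernel. Below is a minimal sketch in standalone C, not the kernel code itself: `keepalive_idle_ticks` and `keepalive_remaining_ticks` are hypothetical names, `uint32_t` stands in for the kernel's `u32`, and returning 0 when the keepalive time has already passed is an assumption, since the `else` branch is cut off in the hunk above.

```c
#include <stdint.h>

/* Ticks since the most recent sign of life from the peer: the smaller of
 * the time since the last ACK (rcv_tstamp) and the time since the last
 * data packet (lrcvtime). Unsigned subtraction keeps both differences
 * correct when the 32-bit tick counter wraps around. */
uint32_t keepalive_idle_ticks(uint32_t now, uint32_t rcv_tstamp,
                              uint32_t lrcvtime)
{
    uint32_t since_ack = now - rcv_tstamp;
    uint32_t since_data = now - lrcvtime;

    return since_ack < since_data ? since_ack : since_data;
}

/* Ticks left before the first keepalive probe is due.
 * Assumption: 0 means the probe is already due. */
uint32_t keepalive_remaining_ticks(uint32_t keepalive_time, uint32_t idle)
{
    return keepalive_time > idle ? keepalive_time - idle : 0;
}
```

Taking the minimum means that either kind of incoming traffic postpones the probe, which matches the RFC 1122 wording quoted above ("no data or acknowledgement packets").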
^ permalink raw reply related
* Re: [PATCH] gigaset: include cleanup cleanup
From: Tejun Heo @ 2010-04-18 2:16 UTC (permalink / raw)
To: Tilman Schmidt
Cc: Karsten Keil, David Miller, Hansjoerg Lipp, i4ldeveloper, netdev,
linux-kernel
In-Reply-To: <20100416220858.A19F540123@xenon.ts.pxnet.com>
Hello,
On 04/17/2010 07:08 AM, Tilman Schmidt wrote:
> Commit 5a0e3ad causes slab.h to be included twice in many of the
> Gigaset driver's source files, first via the common include file
> gigaset.h and then a second time directly. Drop the spares, and
> use the opportunity to clean up a few more similar cases.
>
> Impact: cleanup, no functional change
> Signed-off-by: Tilman Schmidt <tilman@imap.cc>
> CC: Tejun Heo <tj@kernel.org>
Acked-by: Tejun Heo <tj@kernel.org>
Thanks for the clean up.
> Seeing that the "include cleanup" patch triggering this was accepted
> after the merge window, I have hopes this one will be accepted, too.
Hmm... through which tree should this go through? I can route it
through percpu but maybe taking the usual isdn patch path would be
better?
Thanks.
--
tejun
^ permalink raw reply
* [PATCH] X25 fix dead unaccepted sockets
From: Andrew Hendry @ 2010-04-18 0:17 UTC (permalink / raw)
To: netdev
1. An X25 program binds and listens.
2. Calls arrive and wait to be accepted.
3. The program exits without accepting them.
4. The sockets time out but don't get cleaned up correctly.
5. cat /proc/net/x25/socket shows the dead sockets with bad inode fields.
This line, borrowed from AX25, sets up the dying socket so that the timers clean it up later.
Signed-off-by: Andrew Hendry <andrew.hendry@gmail.com>
---
net/x25/af_x25.c | 1 +
1 files changed, 1 insertions(+), 0 deletions(-)
diff --git a/net/x25/af_x25.c b/net/x25/af_x25.c
index cbddd0c..36e84e1 100644
--- a/net/x25/af_x25.c
+++ b/net/x25/af_x25.c
@@ -402,6 +402,7 @@ static void __x25_destroy_socket(struct sock *sk)
/*
* Queue the unaccepted socket for death
*/
+ skb->sk->sk_state = TCP_LISTEN;
sock_set_flag(skb->sk, SOCK_DEAD);
x25_start_heartbeat(skb->sk);
x25_sk(skb->sk)->state = X25_STATE_0;
--
1.5.6.5
^ permalink raw reply related
* Re: [PATCH] Fix SCTP failure with ipv6 source address routing
From: Paul Gortmaker @ 2010-04-18 0:17 UTC (permalink / raw)
To: Vlad Yasevich; +Cc: netdev
In-Reply-To: <4BC510A2.1070105@hp.com>
On 10-04-13 08:47 PM, Vlad Yasevich wrote:
>
>
> Paul Gortmaker wrote:
>> From: Weixing Shi<Weixing.Shi@windriver.com>
>>
>> Given the below test case, using source address routing, SCTP
>> does not work.
>>
>> Node-A:
>> 1)ifconfig eth0 inet6 add 2001:1::1/64
>> 2)ip -6 rule add from 2001:1::1 table 100 pref 100
>> 3)ip -6 route add 2001:2::1 dev eth0 table 100
>> 4)sctp_darn -H 2001:1::1 -P 250 -l&
>>
>> Node-B:
>> 1)ifconfig eth0 inet6 add 2001:2::1/64
>> 2)ip -6 rule add from 2001:2::1 table 100 pref 100
>> 3)ip -6 route add 2001:1::1 dev eth0 table 100
>> 4)sctp_darn -H 2001:2::1 -P 250 -h 2001:1::1 -p 250 -s
>>
>> Root cause:
>> Node-A and Node-B use source address routing, and in the
>> beginning, the source address will be NULL. So SCTP will search
>> the routing table by the destination address (because it is using
>> the source address routing table), and hence the resulting dst_entry
>> will be NULL.
>>
>> Solution:
>> After SCTP gets the correct source address, then we search for
>> dst_entry again, and then we will get the correct value.
>
> The problem here is that ipv6 route lookup code in sctp doesn't bother
> searching for the source address, unlike the v4 route lookup code.
>
> Compare sctp_v4_get_dst() and sctp_v6_get_dst(). The v4 version bends over
> backwards trying to get the correct route, while the v6 version simply does
> a single lookup and returns the result.
>
> The v6 route lookup code needs to be fixed to take into account the bound
> address list.
Thanks for the feedback -- we'll take a look and see if we can
fix it as per your recommendation and re-test.
Paul.
>
> -vlad
>
>>
>> Signed-off-by: Weixing Shi<Weixing.Shi@windriver.com>
>> Signed-off-by: Paul Gortmaker<paul.gortmaker@windriver.com>
>> ---
>> net/sctp/transport.c | 11 +++++++++--
>> 1 files changed, 9 insertions(+), 2 deletions(-)
>>
>> diff --git a/net/sctp/transport.c b/net/sctp/transport.c
>> index be4d63d..b5ae18c 100644
>> --- a/net/sctp/transport.c
>> +++ b/net/sctp/transport.c
>> @@ -295,9 +295,16 @@ void sctp_transport_route(struct sctp_transport *transport,
>>
>> if (saddr)
>> memcpy(&transport->saddr, saddr, sizeof(union sctp_addr));
>> - else
>> + else {
>> af->get_saddr(opt, asoc, dst, daddr,&transport->saddr);
>> -
>> + /* When using source address routing, since dst was
>> + * looked up prior to filling in the source address, dst
>> + * needs to be looked up again to get the correct dst
>> + */
>> + if (dst)
>> + dst_release(dst);
>> + dst = af->get_dst(asoc, daddr,&transport->saddr);
>> + }
>> transport->dst = dst;
>> if ((transport->param_flags& SPP_PMTUD_DISABLE)&& transport->pathmtu) {
>> return;
^ permalink raw reply
* Re: [PATCH v5] rfs: Receive Flow Steering
From: Changli Gao @ 2010-04-18 0:06 UTC (permalink / raw)
To: Tom Herbert; +Cc: Eric Dumazet, David Miller, netdev
In-Reply-To: <h2h65634d661004171038g75160e7avae118dfd1cb1441d@mail.gmail.com>
On Sun, Apr 18, 2010 at 1:38 AM, Tom Herbert <therbert@google.com> wrote:
> That's cool!, but I still like the idea that this hash is treated as
> an opaque value getting the hash from the device to avoid the jhash
> or cache misses on the packet can also be a win... Maybe connection
> tracking/firewall could use the skb->rxhash which provides the
> consistency and also eliminates the need to do more jhashes.
>
A consistent rxhash only adds the risk of hash collisions, and I
don't think that is a big problem. For connection tracking/firewall use,
I am afraid that we have to recompute this value after defrag, so we
would have to export the hash function we use in RPS.
As the NIC's hash function can be changed dynamically, the rxhash isn't
consistent, so it can't be used by connection tracking, socket
lookup, and other consumers that come later.
--
Regards,
Changli Gao(xiaosuo@gmail.com)
^ permalink raw reply
* Re: HTB - What's the minimal value for 'rate' parameter?
From: Jarek Poplawski @ 2010-04-17 21:01 UTC (permalink / raw)
To: Benny Amorsen; +Cc: Antonio Almeida, netdev, kaber, davem, devik
In-Reply-To: <m3bpdi8mhw.fsf@ursa.amorsen.dk>
Benny Amorsen wrote, On 04/17/2010 11:19 AM:
> Jarek Poplawski <jarkao2@gmail.com> writes:
>
>> As I wrote before, the minimal (overflow safe) rate depends on max
>> packet size, and for 1500 byte it would be something around:
>> 1500b/2min, so if your clients can wait so long, try this:
>
> Wouldn't it be nice of either tc or the kernel to warn about wrong
> configurations, or possibly reject them completely?
...or have it documented etc.
Acked-by: Jarek P. ;-)
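For reference, the "1500b/2min" figure quoted above can be converted into a rate. This is only a unit-conversion sketch in standalone C, assuming "1500b" means one 1500-byte packet and "2min" means 120 seconds; `min_rate_bits_per_sec` is a hypothetical helper and does not reproduce HTB's internal overflow check.

```c
/* Rate, in bits per second, needed to send one packet of pkt_bytes
 * bytes within window_sec seconds. */
double min_rate_bits_per_sec(unsigned int pkt_bytes, unsigned int window_sec)
{
    return pkt_bytes * 8.0 / window_sec;
}
```

Under those assumptions, `min_rate_bits_per_sec(1500, 120)` gives 100 bit/s (12.5 bytes per second).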
^ permalink raw reply
* Re: Duplicate IP false alerts from arping
From: unni krishnan @ 2010-04-17 19:39 UTC (permalink / raw)
To: Pascal Hambourg; +Cc: linux-net, netdev
In-Reply-To: <4BC9784B.3020103@plouf.fr.eu.org>
OK, then what is the best method to find a duplicate IP (the same IP
address assigned to different machines)?
On Sat, Apr 17, 2010 at 2:28 PM, Pascal Hambourg
<pascal.mail@plouf.fr.eu.org> wrote:
> Hello,
>
> unni krishnan wrote:
>>
>> I am trying to find a duplicate IP in the network using arping.
>>
>> -------------------------
>> [root@vps1 ~]# ping -c 3 192.168.1.212
>> PING 192.168.1.212 (192.168.1.212) 56(84) bytes of data.
>> 64 bytes from 192.168.1.212: icmp_seq=1 ttl=64 time=1.33 ms
>> 64 bytes from 192.168.1.212: icmp_seq=2 ttl=64 time=0.280 ms
>> 64 bytes from 192.168.1.212: icmp_seq=3 ttl=64 time=0.306 ms
>>
>> --- 192.168.1.212 ping statistics ---
>> 3 packets transmitted, 3 received, 0% packet loss, time 1999ms
>> rtt min/avg/max/mdev = 0.280/0.641/1.339/0.494 ms
>> [root@vps1 ~]# arping -D -I eth0 -c 5 192.168.1.212 ; echo $?
>> ARPING 192.168.1.212 from 0.0.0.0 eth0
>> 0
>> -------------------------
>>
>> As per arping that IP is duplicate.
>
> I disagree. According to man arping :
>
> -D Duplicate address detection mode (DAD). See RFC2131, 4.4.1.
> Returns 0, if DAD succeeded i.e. no replies are received
> ^^^^^^^^^^^^^^^^^^^^^^^
> -D (DAD) is meant for DHCP to find out if the proposed IP address is not
> already assigned to another host. Its purpose is not to find out if
> multiple hosts have the same IP address. Besides, a return value of 0
> means that no ARP replies were received (IOW -D inverts the return value
> logic), which is weird since the target IP address replies to ICMP ping
> unless that address is assigned to the local host.
>
> Here :
>
> # arping -DI eth0 -c 1 192.168.0.246 ; echo result=$?
> ARPING 192.168.0.246 from 0.0.0.0 eth0
> Unicast reply from 192.168.0.246 [xx:xx:xx:xx:xx:xx] 0.964ms
> Sent 1 probes (1 broadcast(s))
> Received 1 response(s)
> result=1
>
> # arping -DI eth0 -c 1 192.168.0.24 ; echo result=$?
> ARPING 192.168.0.24 from 0.0.0.0 eth0
> Sent 1 probes (1 broadcast(s))
> Received 0 response(s)
> result=0
>
>> But if I go ahead and ifdown the
>> IP in the known location I cant ping that IP ( That means that IP is
>> not duplicated ? ). This is the result after shutting down the IP.
>>
>> --------------------------
>> [root@vps1 ~]# ping -c 3 192.168.1.212
>> PING 192.168.1.212 (192.168.1.212) 56(84) bytes of data.
>> From 192.168.1.63 icmp_seq=1 Destination Host Unreachable
>> From 192.168.1.63 icmp_seq=2 Destination Host Unreachable
>> From 192.168.1.63 icmp_seq=3 Destination Host Unreachable
>
> Ok, that means no ARP reply.
>
>> [root@vps1 ~]# arping -D -I eth0 -c 5 192.168.1.212 ; echo $?
>> ARPING 192.168.1.212 from 0.0.0.0 eth0
>> Sent 5 probes (5 broadcast(s))
>> Received 0 response(s)
>> 0
>
> Same as above.
>
>> My question is, in this case IP 192.168.1.212 is not duplicated. But
>> still arping gives duplicate status. Why it is like that ?
>
> A situation of real duplicate ARP replies may occur when the address is
> assigned to a host which has multiple interfaces connected to the same
> network, so it receives and replies to ARP queries on each interface.
>
--
Regards,
Unni
http://mutexes.org/
http://twitter.com/webofunni
^ permalink raw reply