* [PATCH V2 07/11] dnet: enable transmit time stamping.
From: Richard Cochran @ 2011-06-10 15:24 UTC (permalink / raw)
To: netdev; +Cc: David Miller, Ilya Yanok
In-Reply-To: <cover.1307719258.git.richard.cochran@omicron.at>
This patch enables software (and phy device) transmit time stamping
in the "Dave ethernet interface." Compile tested only.
Cc: Ilya Yanok <yanok@emcraft.com>
Signed-off-by: Richard Cochran <richard.cochran@omicron.at>
---
drivers/net/dnet.c | 2 ++
1 files changed, 2 insertions(+), 0 deletions(-)
diff --git a/drivers/net/dnet.c b/drivers/net/dnet.c
index 8318ea0..c36763c 100644
--- a/drivers/net/dnet.c
+++ b/drivers/net/dnet.c
@@ -587,6 +587,8 @@ static netdev_tx_t dnet_start_xmit(struct sk_buff *skb, struct net_device *dev)
dnet_writel(bp, irq_enable, INTR_ENB);
}
+ skb_tx_timestamp(skb);
+
/* free the buffer */
dev_kfree_skb(skb);
--
1.7.0.4
^ permalink raw reply related
* [PATCH V2 08/11] ethoc: enable transmit time stamping.
From: Richard Cochran @ 2011-06-10 15:24 UTC (permalink / raw)
To: netdev; +Cc: David Miller, Thierry Reding
In-Reply-To: <cover.1307719258.git.richard.cochran@omicron.at>
This patch enables software (and phy device) transmit time stamping
for the OpenCores 10/100 MAC driver. Compile tested only.
Cc: Thierry Reding <thierry.reding@avionic-design.de>
Signed-off-by: Richard Cochran <richard.cochran@omicron.at>
---
drivers/net/ethoc.c | 1 +
1 files changed, 1 insertions(+), 0 deletions(-)
diff --git a/drivers/net/ethoc.c b/drivers/net/ethoc.c
index a83dd31..8645ec8 100644
--- a/drivers/net/ethoc.c
+++ b/drivers/net/ethoc.c
@@ -874,6 +874,7 @@ static netdev_tx_t ethoc_start_xmit(struct sk_buff *skb, struct net_device *dev)
}
spin_unlock_irq(&priv->lock);
+ skb_tx_timestamp(skb);
out:
dev_kfree_skb(skb);
return NETDEV_TX_OK;
--
1.7.0.4
^ permalink raw reply related
* [PATCH V2 09/11] r6040: enable transmit time stamping.
From: Richard Cochran @ 2011-06-10 15:24 UTC (permalink / raw)
To: netdev; +Cc: David Miller, Florian Fainelli
In-Reply-To: <cover.1307719258.git.richard.cochran@omicron.at>
This patch enables software (and phy device) transmit time stamping
for the RDC R6040 Fast Ethernet MAC. Compile tested only.
Cc: Florian Fainelli <florian@openwrt.org>
Signed-off-by: Richard Cochran <richard.cochran@omicron.at>
---
drivers/net/r6040.c | 2 ++
1 files changed, 2 insertions(+), 0 deletions(-)
diff --git a/drivers/net/r6040.c b/drivers/net/r6040.c
index 200a363..5ee5f8f 100644
--- a/drivers/net/r6040.c
+++ b/drivers/net/r6040.c
@@ -846,6 +846,8 @@ static netdev_tx_t r6040_start_xmit(struct sk_buff *skb,
spin_unlock_irqrestore(&lp->lock, flags);
+ skb_tx_timestamp(skb);
+
return NETDEV_TX_OK;
}
--
1.7.0.4
^ permalink raw reply related
* [PATCH V2 10/11] stmmac: enable transmit time stamping.
From: Richard Cochran @ 2011-06-10 15:24 UTC (permalink / raw)
To: netdev; +Cc: David Miller, Giuseppe Cavallaro
In-Reply-To: <cover.1307719258.git.richard.cochran@omicron.at>
This patch enables software (and phy device) transmit time stamping
for the STMicroelectronics Ethernet driver. Compile tested only.
Cc: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Signed-off-by: Richard Cochran <richard.cochran@omicron.at>
---
drivers/net/stmmac/stmmac_main.c | 2 ++
1 files changed, 2 insertions(+), 0 deletions(-)
diff --git a/drivers/net/stmmac/stmmac_main.c b/drivers/net/stmmac/stmmac_main.c
index e25e44a..5b85f4c 100644
--- a/drivers/net/stmmac/stmmac_main.c
+++ b/drivers/net/stmmac/stmmac_main.c
@@ -1081,6 +1081,8 @@ static netdev_tx_t stmmac_xmit(struct sk_buff *skb, struct net_device *dev)
priv->hw->dma->enable_dma_transmission(priv->ioaddr);
+ skb_tx_timestamp(skb);
+
return NETDEV_TX_OK;
}
--
1.7.0.4
^ permalink raw reply related
* [PATCH V2 11/11] smsc9420: enable transmit time stamping.
From: Richard Cochran @ 2011-06-10 15:24 UTC (permalink / raw)
To: netdev; +Cc: David Miller, Steve Glendinning
In-Reply-To: <cover.1307719258.git.richard.cochran@omicron.at>
This patch enables software (and phy device) transmit time stamping
for the smsc9420. Compile tested only.
Cc: Steve Glendinning <steve.glendinning@smsc.com>
Signed-off-by: Richard Cochran <richard.cochran@omicron.at>
---
drivers/net/smsc9420.c | 2 ++
1 files changed, 2 insertions(+), 0 deletions(-)
diff --git a/drivers/net/smsc9420.c b/drivers/net/smsc9420.c
index 4c92ad8..2d84f92 100644
--- a/drivers/net/smsc9420.c
+++ b/drivers/net/smsc9420.c
@@ -1034,6 +1034,8 @@ static netdev_tx_t smsc9420_hard_start_xmit(struct sk_buff *skb,
smsc9420_reg_write(pd, TX_POLL_DEMAND, 1);
smsc9420_pci_flush_write(pd);
+ skb_tx_timestamp(skb);
+
return NETDEV_TX_OK;
}
--
1.7.0.4
^ permalink raw reply related
* Re: [PATCH v2 1/5] ep93xx: set DMA masks for the ep93xx_eth
From: Petr Štetiar @ 2011-06-10 15:47 UTC (permalink / raw)
To: Mika Westerberg
Cc: Petr Štetiar, netdev, kernel, hsweeten, ryan, davem,
linux-arm-kernel
In-Reply-To: <20110605085948.GB16018@acer>
Mika Westerberg <mika.westerberg@iki.fi> [2011-06-05 11:59:48]:
> On Sun, Jun 05, 2011 at 10:34:45AM +0200, Petr Štetiar wrote:
> >
> > do you have the series available somewhere for pull? I would like to test the
> > changes for you on my ts7250/ts7300, but I'm quite lazy and would like to just
> > cherry-pick the changes if possible :-)
>
> Unfortunately - no. But what's wrong with 'git am <mbox>'? ;-)
I've tested the whole series and didn't find any problems so far, thanks. You
can add my
Tested-by: Petr Štetiar <ynezz@true.cz>
if you wish.
-- ynezz
^ permalink raw reply
* Re: TCP keepalives ignored by kernel when they contain timestamps
From: Charles Bearden @ 2011-06-10 16:07 UTC (permalink / raw)
To: Eric Dumazet; +Cc: netdev
In-Reply-To: <4DF233CE.5020000@uth.tmc.edu>
On 06/10/2011 10:10 AM, Charles Bearden wrote:
> On 06/10/2011 08:56 AM, Eric Dumazet wrote:
>> On Thursday, 09 June 2011 at 10:26 -0500, Charles Bearden wrote:
>>> I have come across a case that looks like it might be a kernel bug. It appears
>>> that tcp keepalives sent by a remote system are ignored when they contain tcp
>>> timestamps, but are ACKed when they don't. When they are ignored, the remote
>>> system resets the connection after a number of retries.
>>>
>>> I have replicated this problem on both Ubuntu 10.04 with a 2.6.32-32-server
>>> kernel (x86_64) and CentOS 5.6 with a 2.6.18-238.12.1.el5 kernel. I'm sorry that
>>> I haven't had a chance to try to replicate the bug with a newer kernel, though a
>>> co-worker has looked through changelogs for more recent kernels and didn't find
>>> anything that looked relevant.
>>>
>>> From either of these hosts I run an application that connects to a remote host
>>> for 2-3 minutes, and that for most of that time sends no application data back
>>> and forth. After 30 seconds of no data from the Linux host, the remote host
>>> sends a garden variety keepalive. When the remote host includes tcp timestamps
>>> in the keepalives, they are ignored by the Linux host, and the remote host
>>> resets the connection after 10 unACKed keepalives. When timestamps are absent
>>> from the keepalives, the Linux host ACKs each one, and all is copacetic.
>>>
>>> Text output of a tcpdump trace of a connection that fails:
>>> http://pastebin.com/v6CpteJ9
>>>
>>> Text output of a tcpdump trace of a connection that succeeds:
>>> http://pastebin.com/KVLb3Mzh
>>>
>>> More details, in case you think they are relevant:
>>>
>>> My application creates a JDBC connection to a remote MS SQL Server and
>>> executes a statement that does not return a result set, and so it doesn't
>>> need to pass application data back and forth while it executes. The
>>> statement takes 2 or 3 minutes to complete. I connect to two different
>>> remote hosts: a Win2003 machine, and a Win2008R2 machine. The Win2003
>>> machine doesn't put timestamps in its keep-alives, so the application
>>> completes successfully when connecting to that host. If tcp timestamps
>>> are enabled on the Linux host, the Win2008 host includes them in its
>>> keepalives, and they are unACKed, so the connection is reset; if they
>>> are disabled on the Linux host, the Win2008 host doesn't include them in
>>> the keepalives, and the application completes successfully. I use (as
>>> you might expect) sysctl to disable tcp timestamps on the Linux hosts.
>>>
>>> I have dumps for all permutations of CentOS/Ubuntu, Win200[38], and +/-
>>> timestamps on the Linux side, and I will post them if the developers think that
>>> they would be useful.
>>
>> Hi Charles
>>
>> I could not reproduce the problem here, even using a quite old kernel as
>> receiver (2.6.9)
>>
>> 15:54:33.566192 IP 192.168.20.108.55926 > 192.168.20.124.777: SWE 479814493:479814493(0) win 14600 <mss 1460,sackOK,timestamp 151666 0,nop,wscale 7>
>> 15:54:33.566265 IP 192.168.20.124.777 > 192.168.20.108.55926: S 3714869381:3714869381(0) ack 479814494 win 5792 <mss 1460,sackOK,timestamp 54553041 151666,nop,wscale 2>
>> 15:54:33.566274 IP 192.168.20.108.55926 > 192.168.20.124.777: . ack 1 win 115 <nop,nop,timestamp 151666 54553041>
>> 15:54:33.566281 IP 192.168.20.108.55926 > 192.168.20.124.777: P 1:5(4) ack 1 win 115 <nop,nop,timestamp 151666 54553041>
>> 15:54:33.566351 IP 192.168.20.124.777 > 192.168.20.108.55926: . ack 5 win 1448 <nop,nop,timestamp 54553041 151666>
>> 15:54:33.566375 IP 192.168.20.124.777 > 192.168.20.108.55926: P 1:5(4) ack 5 win 1448 <nop,nop,timestamp 54553041 151666>
>> 15:54:33.566380 IP 192.168.20.108.55926 > 192.168.20.124.777: . ack 5 win 115 <nop,nop,timestamp 151666 54553041>
>> 15:54:43.577945 IP 192.168.20.108.55926 > 192.168.20.124.777: . 4:5(1) ack 5 win 115 <nop,nop,timestamp 152668 54553041>
>> 15:54:43.578012 IP 192.168.20.124.777 > 192.168.20.108.55926: . ack 5 win 1448 <nop,nop,timestamp 54563053 152668,nop,nop,sack sack 1 {4:5}>
>> 15:54:53.597946 IP 192.168.20.108.55926 > 192.168.20.124.777: . 4:5(1) ack 5 win 115 <nop,nop,timestamp 153670 54563053>
>> 15:54:53.598012 IP 192.168.20.124.777 > 192.168.20.108.55926: . ack 5 win 1448 <nop,nop,timestamp 54573073 153670,nop,nop,sack sack 1 {4:5}>
>>
>>
>> Are you sure the frames' TCP checksums are OK when the 'faulty' Linux host
>> receives them? (tcpdump -v)
>
> I will check when I get into the office and let you know.
You are correct: the checksums in the keepalives are broken, though they are
correct in the other segments from the Win2008 server. I have updated the pastes
linked to above with 'tcpdump -v' output. I apologize for missing that problem
the first time around.
Chuck
^ permalink raw reply
* Re: TCP keepalives ignored by kernel when they contain timestamps
From: Eric Dumazet @ 2011-06-10 16:17 UTC (permalink / raw)
To: Charles Bearden; +Cc: netdev
In-Reply-To: <4DF24154.3060505@uth.tmc.edu>
On Friday, 10 June 2011 at 11:07 -0500, Charles Bearden wrote:
> You are correct: the checksums in the keepalives are broken, though they are
> correct in the other segments from the Win2008 server. I have updated the pastes
> linked to above with 'tcpdump -v' output. I apologize for missing that problem
> the first time around.
Hmm, maybe it's OK: if checksums are offloaded to the NIC, tcpdump 'lies'
and reports that the checksum is bad, since tcpdump gets a copy of the
packet before it is handled by the NIC.
You should take a tcpdump on the receiving machine (the machine that
apparently doesn't react to the keepalive probes).
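To double-check a captured segment by hand, the Internet checksum (RFC 1071) can be recomputed over its bytes. The following is only a minimal sketch: it assumes the caller has already laid out the TCP pseudo-header followed by the segment in one buffer, and `inet_checksum` is a name made up for this illustration, not a kernel or tcpdump function.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: RFC 1071 ones-complement checksum over len bytes.
 * The 32-bit accumulator is sufficient for packet-sized buffers.
 */
static uint16_t inet_checksum(const uint8_t *data, size_t len)
{
	uint32_t sum = 0;
	size_t i;

	/* Add up the buffer as big-endian 16-bit words. */
	for (i = 0; i + 1 < len; i += 2)
		sum += ((uint32_t)data[i] << 8) | data[i + 1];

	/* An odd trailing byte is treated as the high byte of a last word. */
	if (len & 1)
		sum += (uint32_t)data[len - 1] << 8;

	/* Fold the carries back into the low 16 bits. */
	while (sum >> 16)
		sum = (sum & 0xffff) + (sum >> 16);

	return (uint16_t)~sum;
}
```

When the buffer includes a correct checksum field, the function returns 0; any other result means the segment would be dropped by a receiver that verifies checksums in software.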
^ permalink raw reply
* Re: [PATCH 3/9] veth: convert to 64 bit statistics
From: Ben Hutchings @ 2011-06-10 16:34 UTC (permalink / raw)
To: Stephen Hemminger; +Cc: David S. Miller, netdev
In-Reply-To: <20110608202241.52ce68ae@nehalam.ftrdhcpuser.net>
On Wed, 2011-06-08 at 20:22 -0700, Stephen Hemminger wrote:
> On Thu, 09 Jun 2011 03:00:41 +0100
> Ben Hutchings <bhutchings@solarflare.com> wrote:
>
> > On Wed, 2011-06-08 at 17:53 -0700, Stephen Hemminger wrote:
> > > Not much change, device was already keeping per cpu statistics.
> > > Use recent 64-bit statistics interface.
> > [...]
> >
> > It's also going to need to use u64_stats_sync functions.
> >
> > Ben.
> >
>
> No, veth is doing per-cpu updates; therefore the new code has the same
> guarantee as the old code.
Per-cpu statistics avoid the problem of races between writers without
using expensive atomic operations. They don't solve the problem of
races between writer and reader (on 32-bit systems).
For example, if veth_get_stats() races with a receive or transmit that
increments a value from 0x00000001ffffffff to 0x0000000200000000, it may
see the value as 0x00000002ffffffff.
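The race in this example can be reproduced deterministically in user space. The sketch below is only an illustration under stated assumptions: `split_counter`, `counter_set` and `torn_read` are hypothetical names, and the "writer" is simply run between the reader's two 32-bit loads rather than on another CPU.

```c
#include <stdint.h>

/* A 64-bit counter as a 32-bit CPU stores it: two separate words. */
struct split_counter {
	uint32_t lo;
	uint32_t hi;
};

static void counter_set(struct split_counter *c, uint64_t v)
{
	c->lo = (uint32_t)v;
	c->hi = (uint32_t)(v >> 32);
}

/*
 * Simulate an unsynchronized reader that loads the low word, is
 * overtaken by a writer storing new_value, then loads the high word.
 */
static uint64_t torn_read(struct split_counter *c, uint64_t new_value)
{
	uint32_t lo = c->lo;			/* first half: old value */

	counter_set(c, new_value);		/* writer runs in between */

	return ((uint64_t)c->hi << 32) | lo;	/* second half: new value */
}
```

Starting from 0x00000001ffffffff and updating to 0x0000000200000000, `torn_read()` returns 0x00000002ffffffff, a value the counter never held. The u64_stats_sync functions exist so that a reader can detect this overlap and retry.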
Ben.
--
Ben Hutchings, Senior Software Engineer, Solarflare
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.
^ permalink raw reply
* Re: TCP keepalives ignored by kernel when they contain timestamps
From: Charles Bearden @ 2011-06-10 16:39 UTC (permalink / raw)
To: Eric Dumazet; +Cc: netdev
In-Reply-To: <1307722621.4044.17.camel@edumazet-laptop>
On 06/10/2011 11:17 AM, Eric Dumazet wrote:
> On Friday, 10 June 2011 at 11:07 -0500, Charles Bearden wrote:
>
>> You are correct: the checksums in the keepalives are broken, though they are
>> correct in the other segments from the Win2008 server. I have updated the pastes
>> linked to above with 'tcpdump -v' output. I apologize for missing that problem
>> the first time around.
>
> Hmm, maybe it's OK: if checksums are offloaded to the NIC, tcpdump 'lies'
> and reports that the checksum is bad, since tcpdump gets a copy of the
> packet before it is handled by the NIC.
>
> You should take a tcpdump on the receiving machine (the machine that
> apparently doesn't react to the keepalive probes).
I should have made this clear before: these dumps were captured on the machine
that is ignoring the keepalives ("Ubuntu.host" in the dumps) from the other host
("Win200[38].host" in the dumps). If I understand tcp checksum offloading
correctly, it wouldn't play a role here.
^ permalink raw reply
* Re: KVM induced panic on 2.6.38[2367] & 2.6.39
From: Henrique de Moraes Holschuh @ 2011-06-10 16:43 UTC (permalink / raw)
To: Mark Lord
Cc: Simon Horman, Brad Campbell, Eric Dumazet, Patrick McHardy,
Bart De Schuymer, kvm, linux-mm, linux-kernel, netdev,
netfilter-devel
In-Reply-To: <4DF21002.3040708@teksavvy.com>
On Fri, 10 Jun 2011, Mark Lord wrote:
> Something many of us don't realize is that nearly all Intel chipsets
> have a built-in hardware watchdog timer. This includes chipsets for
> consumer desktop boards as well as the big iron server stuff.
>
> The "i8xx_tco" driver in the kernel enables use of them:
That's the old module name, but yes, it is very useful in desktops and
laptops (when it works). Server-class hardware will have a baseboard
management unit that can really power-cycle the system instead of just
rebooting.
And test it first before you depend on it triggering at a remote location,
as the firmware might cause the Intel chipset watchdog to actually hang the
box instead of causing a proper reboot (happens on the IBM ThinkPad T43, for
example).
--
"One disk to rule them all, One disk to find them. One disk to bring
them all and in the darkness grind them. In the Land of Redmond
where the shadows lie." -- The Silicon Valley Tarot
Henrique Holschuh
^ permalink raw reply
* No traffic flow with Marvell 88E6095F using DSA
From: Barry G @ 2011-06-10 16:52 UTC (permalink / raw)
To: netdev, buytenh
Hello all,
We are trying to bring up a board with two Marvell 88E6095F chips and
one 88E6185 chip using the Distributed Switch Architecture (DSA)
facilities in the kernel.
We have a rough driver working, but we can't get traffic to flow in
or out of the Marvell chips.
To simplify the system, I am trying to bring up only the chip
attached to the CPU (88E6095F). I have a dsa_chip_data with
static struct dsa_chip_data switches[] = {
	{
		.sw_addr = 0x9,
		.port_names = {
			"eth%d", //0
			"eth%d", //1
			"eth%d", //2
			"eth%d", //3
			"eth%d", //4
			"eth%d", //5
			"eth%d", //6
			"eth%d", //7
			NULL, //8
			NULL, //9
			"cpu", //10
		},
	}
};
Ports 8 and 9 are DSA interfaces to other chips.
The host CPU is a Freescale 8308 hooked up to TSEC1.
We are using the mdio bus from the processor to drive
the mdio on the Marvells.
Using this configuration with the driver shell I posted
previously the system boots and I see:
Distributed Switch Architecture driver version 0.1
eth0[0]: detected a Marvell 88E6095/88E6095F switch
dsa slave smi: probed
All good. If I run ip link show I get:
# ip link show
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 16436 qdisc noqueue
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
2: bond0: <BROADCAST,MULTICAST,MASTER> mtu 1500 qdisc noop
link/ether 00:00:00:00:00:00 brd ff:ff:ff:ff:ff:ff
3: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast qlen 1000
link/ether 00:30:a7:aa:aa:01 brd ff:ff:ff:ff:ff:ff
4: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast qlen 1000
link/ether 00:30:a7:aa:aa:02 brd ff:ff:ff:ff:ff:ff
5: teql0: <NOARP> mtu 1500 qdisc noop qlen 100
link/void
6: tunl0: <NOARP> mtu 1480 qdisc noop
link/ipip 0.0.0.0 brd 0.0.0.0
7: eth2@eth0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue
link/ether 00:30:a7:aa:aa:01 brd ff:ff:ff:ff:ff:ff
8: eth3@eth0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue
link/ether 00:30:a7:aa:aa:01 brd ff:ff:ff:ff:ff:ff
9: eth4@eth0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue
link/ether 00:30:a7:aa:aa:01 brd ff:ff:ff:ff:ff:ff
10: eth5@eth0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue
link/ether 00:30:a7:aa:aa:01 brd ff:ff:ff:ff:ff:ff
11: eth6@eth0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue
link/ether 00:30:a7:aa:aa:01 brd ff:ff:ff:ff:ff:ff
12: eth7@eth0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue
link/ether 00:30:a7:aa:aa:01 brd ff:ff:ff:ff:ff:ff
13: eth8@eth0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue
link/ether 00:30:a7:aa:aa:01 brd ff:ff:ff:ff:ff:ff
14: eth9@eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue
link/ether 00:30:a7:aa:aa:01 brd ff:ff:ff:ff:ff:ff
This is after running "ip link set dev eth0 up" and "ip link set dev eth9 up".
Eth9 is hooked up to a cable. Removing the cable results in the
"eth9: link down" message and plugging it back in gives the "eth9:
link up, 100 Mb/s, half duplex, flow control disabled".
So it looks like everything is stellar, but we can't get traffic
to leave the device. When I configure eth9 with an address (192.168.1.50/24)
and try to ping 192.168.1.25, I get the following with tcpdump on eth9:
20:23:11.682628 arp who-has 192.168.1.25 tell 192.168.1.50 (repeats)
And on eth0 I see:
20:23:54.692701 00:30:a7:aa:aa:01 (oui Unknown) > Broadcast, ethertype Unknown (0x4038), length 46:
0x0000: 0000 0806 0001 0800 0604 0001 0030 a7aa .............0..
0x0010: aa01 c0a8 0132 0000 0000 0000 c0a8 0119 .....2..........
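The "ethertype Unknown (0x4038)" in this dump is most likely not an EtherType at all but the first two bytes of the 4-byte DSA tag that the kernel inserts after the source MAC; the real EtherType (0x0806, ARP) follows it. The sketch below decodes such a tag. It assumes the layout used by net/dsa/tag_dsa.c as I understand it (frame mode in the top two bits of byte 0, a "tagged" flag in bit 5, device in the low five bits, port in the top five bits of byte 1); the type and function names are made up for this illustration.

```c
#include <stdint.h>

/* Hypothetical names for the frame mode in bits 7:6 of tag byte 0. */
enum dsa_frame_mode {
	DSA_TO_CPU	= 0,
	DSA_FROM_CPU	= 1,
	DSA_TO_SNIFFER	= 2,
	DSA_FORWARD	= 3,
};

struct dsa_tag_info {
	enum dsa_frame_mode mode;
	int tagged;		/* frame carried an 802.1Q tag */
	unsigned int device;	/* switch chip number */
	unsigned int port;	/* port on that chip */
};

/* Decode the first two bytes of an (assumed) 4-byte Marvell DSA tag. */
static struct dsa_tag_info dsa_tag_decode(const uint8_t tag[4])
{
	struct dsa_tag_info info;

	info.mode = (enum dsa_frame_mode)(tag[0] >> 6);
	info.tagged = (tag[0] >> 5) & 1;
	info.device = tag[0] & 0x1f;
	info.port = (tag[1] >> 3) & 0x1f;

	return info;
}
```

For the bytes 40 38 00 00 seen above, this gives FROM_CPU, untagged, device 0, port 7, which would match eth9 if ports 0 to 7 were registered in order as eth2 to eth9.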
However if I hook up to the port with wireshark I see nothing leaving
the port. Likewise if I attempt to ping into the device from 192.168.1.25
I see nothing coming in on the tcpdump for eth0 or eth9.
Linux seems happy:
# ./sbin/mii-tool -v eth9
eth9: no autonegotiation, 100baseTx-HD, link ok
product info: vendor 00:50:43, model 8 rev 5
basic mode: autonegotiation enabled
basic status: autonegotiation complete, link ok
capabilities: 100baseTx-FD 100baseTx-HD 10baseT-FD 10baseT-HD
advertising: 100baseTx-FD 100baseTx-HD 10baseT-FD 10baseT-HD
link partner: 100baseTx-HD
# ethtool -S eth9
NIC statistics:
tx_packets: 1534
tx_bytes: 64428
rx_packets: 0
rx_bytes: 0
in_good_octets: 151080
in_bad_octets: 0
in_unicast: 0
in_broadcasts: 2035
in_multicasts: 78
in_pause: 0
in_undersize: 0
in_fragments: 0
in_oversize: 0
in_jabber: 0
in_rx_error: 0
in_fcs_error: 0
out_octets: 0
out_unicast: 0
out_broadcasts: 0
out_multicasts: 0
out_pause: 0
excessive: 0
collisions: 0
deferred: 0
single: 0
multiple: 0
out_fcs_error: 0
late: 0
hist_64bytes: 2045
hist_65_127bytes: 9
hist_128_255bytes: 16
hist_256_511bytes: 43
hist_512_1023bytes: 0
hist_1024_max_bytes: 0
#
# cat /proc/net/dev
Inter-|   Receive                                                |  Transmit
 face |bytes    packets errs drop fifo frame compressed multicast|bytes    packets errs drop fifo colls carrier compressed
    lo:  194432    1736    0    0    0     0          0         0   194432    1736    0    0    0     0       0          0
 bond0:       0       0    0    0    0     0          0         0        0       0    0    0    0     0       0          0
  eth0:       0       0    0    0    0     0          0         0    77326    1681    6    0    6     0       0          0
  eth1:  531776    5611    0  349    0     0          0         0   814765    3305    0    0    0     0       0          0
 teql0:       0       0    0    0    0     0          0         0        0       0    0    0    0     0       0          0
 tunl0:       0       0    0    0    0     0          0         0        0       0    0    0    0     0       0          0
  eth2:       0       0    0    0    0     0          0         0        0       0    0    0    0     0       0          0
  eth3:       0       0    0    0    0     0          0         0        0       0    0    0    0     0       0          0
  eth4:       0       0    0    0    0     0          0         0        0       0    0    0    0     0       0          0
  eth5:       0       0    0    0    0     0          0         0        0       0    0    0    0     0       0          0
  eth6:       0       0    0    0    0     0          0         0        0       0    0    0    0     0       0          0
  eth7:       0       0    0    0    0     0          0         0        0       0    0    0    0     0       0          0
  eth8:       0       0    0    0    0     0          0         0        0       0    0    0    0     0       0          0
  eth9:       0       0    0    0    0     0          0         0    70602    1681    0    0    0     0       0          0
This problem looks really similar to the thread at:
http://kerneltrap.org/mailarchive/linux-netdev/2009/3/7/5116114/thread
but I have the patch Gary presents already in the tree :-(
I am running 2.6.39-rc4+.
Any help or guidance would be appreciated! How can I
best approach debugging this issue?
Thanks,
Barry
^ permalink raw reply
* Re: TCP keepalives ignored by kernel when they contain timestamps
From: Eric Dumazet @ 2011-06-10 16:54 UTC (permalink / raw)
To: Charles Bearden; +Cc: netdev
In-Reply-To: <4DF248C5.9050004@uth.tmc.edu>
On Friday, 10 June 2011 at 11:39 -0500, Charles Bearden wrote:
> On 06/10/2011 11:17 AM, Eric Dumazet wrote:
> > On Friday, 10 June 2011 at 11:07 -0500, Charles Bearden wrote:
> >
> >> You are correct: the checksums in the keepalives are broken, though they are
> >> correct in the other segments from the Win2008 server. I have updated the pastes
> >> linked to above with 'tcpdump -v' output. I apologize for missing that problem
> >> the first time around.
> >
> > Hmm, maybe it's OK: if checksums are offloaded to the NIC, tcpdump 'lies'
> > and reports that the checksum is bad, since tcpdump gets a copy of the
> > packet before it is handled by the NIC.
> >
> > You should take a tcpdump on the receiving machine (the machine that
> > apparently doesn't react to the keepalive probes).
>
> I should have made this clear before: these dumps were captured on the machine
> that is ignoring the keepalives ("Ubuntu.host" in the dumps) from the other host
> ("Win200[38].host" in the dumps). If I understand tcp checksum offloading
> correctly, it wouldn't play a role here.
>
OK, thanks.
I am curious: could you check whether the payload byte included in the
keepalive probes has the value of the last byte sent on the session?
(Linux sends keepalive probes with no data, so implementing this
'feature' would require keeping track of this value in the TCP socket.)
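For reference, RFC 1122 (section 4.2.3.6) describes a keepalive probe as a segment whose sequence number is one below the next byte the sender would send, carrying either no data or one garbage octet. A receiver-side check can be sketched as follows; `looks_like_keepalive` is a hypothetical helper for illustration and not how the Linux TCP stack classifies segments.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical check: does an incoming segment have the shape of a
 * keepalive probe?  rcv_nxt is the next sequence number the receiver
 * expects.  Unsigned arithmetic handles sequence number wraparound.
 */
static bool looks_like_keepalive(uint32_t seq, uint32_t payload_len,
				 uint32_t rcv_nxt)
{
	return payload_len <= 1 && seq == rcv_nxt - 1;
}
```

The probes in the earlier trace ("4:5(1)" while the peer acknowledges 5) fit this shape with a one-byte payload: sequence 4, length 1, next expected 5.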
^ permalink raw reply
* Re: [PATCH v3] vlan: Fix the ingress VLAN_FLAG_REORDER_HDR check
From: Jiri Pirko @ 2011-06-10 16:56 UTC (permalink / raw)
To: Changli Gao
Cc: David Miller, pratnakarlx, ebiederm, shemminger, greearb,
nicolas.2p.debian, netdev, kaber, fubar, eric.dumazet, andy,
jesse
In-Reply-To: <BANLkTinctdoyYJo+sU9qUWV8uZV8yGNdnw@mail.gmail.com>
This time heavily based on Eric's V2. mac_len is reset at appropriate
places. Also skb->data is adjusted to point to beginning of mac header
before calling vlan_insert_tag.
Please review (fingers crossed).
Subject: [patch net-2.6 v2.1] vlan: Fix the ingress VLAN_FLAG_REORDER_HDR check
Testing of VLAN_FLAG_REORDER_HDR does not belong in vlan_untag
but rather in vlan_do_receive. Otherwise the vlan header
will not be properly put on the packet in the case of
vlan header acceleration.
As we remove the check from vlan_check_reorder_header
rename it vlan_reorder_header to keep the naming clean.
Fix up the skb->pkt_type early so we don't look at the packet
after adding the vlan tag, which guarantees we don't goof
and look at the wrong field.
Use a simple if statement instead of a complicated switch
statement to decide that we need to increment rx_stats
for a multicast packet.
Hopefully at some point we will just declare the case where
VLAN_FLAG_REORDER_HDR is cleared as unsupported and remove
the code. Until then this keeps it working correctly.
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Jiri Pirko <jpirko@redhat.com>
---
include/linux/if_vlan.h | 25 +++++++++++++++++--
include/linux/skbuff.h | 5 ++++
net/8021q/vlan_core.c | 60 +++++++++++++++++++++++++---------------------
net/core/dev.c | 2 +-
4 files changed, 61 insertions(+), 31 deletions(-)
diff --git a/include/linux/if_vlan.h b/include/linux/if_vlan.h
index dc01681..affa273 100644
--- a/include/linux/if_vlan.h
+++ b/include/linux/if_vlan.h
@@ -225,7 +225,7 @@ static inline int vlan_hwaccel_receive_skb(struct sk_buff *skb,
}
/**
- * __vlan_put_tag - regular VLAN tag inserting
+ * vlan_insert_tag - regular VLAN tag inserting
* @skb: skbuff to tag
* @vlan_tci: VLAN TCI to insert
*
@@ -234,8 +234,10 @@ static inline int vlan_hwaccel_receive_skb(struct sk_buff *skb,
*
* Following the skb_unshare() example, in case of error, the calling function
* doesn't have to worry about freeing the original skb.
+ *
+ * Does not change skb->protocol so this function can be used during receive.
*/
-static inline struct sk_buff *__vlan_put_tag(struct sk_buff *skb, u16 vlan_tci)
+static inline struct sk_buff *vlan_insert_tag(struct sk_buff *skb, u16 vlan_tci)
{
struct vlan_ethhdr *veth;
@@ -255,8 +257,25 @@ static inline struct sk_buff *__vlan_put_tag(struct sk_buff *skb, u16 vlan_tci)
/* now, the TCI */
veth->h_vlan_TCI = htons(vlan_tci);
- skb->protocol = htons(ETH_P_8021Q);
+ return skb;
+}
+/**
+ * __vlan_put_tag - regular VLAN tag inserting
+ * @skb: skbuff to tag
+ * @vlan_tci: VLAN TCI to insert
+ *
+ * Inserts the VLAN tag into @skb as part of the payload
+ * Returns a VLAN tagged skb. If a new skb is created, @skb is freed.
+ *
+ * Following the skb_unshare() example, in case of error, the calling function
+ * doesn't have to worry about freeing the original skb.
+ */
+static inline struct sk_buff *__vlan_put_tag(struct sk_buff *skb, u16 vlan_tci)
+{
+ skb = vlan_insert_tag(skb, vlan_tci);
+ if (skb)
+ skb->protocol = htons(ETH_P_8021Q);
return skb;
}
diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index e8b78ce..c0a4f3a 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -1256,6 +1256,11 @@ static inline void skb_reserve(struct sk_buff *skb, int len)
skb->tail += len;
}
+static inline void skb_reset_mac_len(struct sk_buff *skb)
+{
+ skb->mac_len = skb->network_header - skb->mac_header;
+}
+
#ifdef NET_SKBUFF_DATA_USES_OFFSET
static inline unsigned char *skb_transport_header(const struct sk_buff *skb)
{
diff --git a/net/8021q/vlan_core.c b/net/8021q/vlan_core.c
index 41495dc..fcc6846 100644
--- a/net/8021q/vlan_core.c
+++ b/net/8021q/vlan_core.c
@@ -23,6 +23,31 @@ bool vlan_do_receive(struct sk_buff **skbp)
return false;
skb->dev = vlan_dev;
+ if (skb->pkt_type == PACKET_OTHERHOST) {
+ /* Our lower layer thinks this is not local, let's make sure.
+ * This allows the VLAN to have a different MAC than the
+ * underlying device, and still route correctly. */
+ if (!compare_ether_addr(eth_hdr(skb)->h_dest,
+ vlan_dev->dev_addr))
+ skb->pkt_type = PACKET_HOST;
+ }
+
+ if (!(vlan_dev_info(vlan_dev)->flags & VLAN_FLAG_REORDER_HDR)) {
+ unsigned int offset = skb->data - skb_mac_header(skb);
+
+ /*
+ * vlan_insert_tag expect skb->data pointing to mac header.
+ * So change skb->data before calling it and change back to
+ * original position later
+ */
+ skb_push(skb, offset);
+ skb = *skbp = vlan_insert_tag(skb, skb->vlan_tci);
+ if (!skb)
+ return false;
+ skb_pull(skb, offset + VLAN_HLEN);
+ skb_reset_mac_len(skb);
+ }
+
skb->priority = vlan_get_ingress_priority(vlan_dev, skb->vlan_tci);
skb->vlan_tci = 0;
@@ -31,22 +56,8 @@ bool vlan_do_receive(struct sk_buff **skbp)
u64_stats_update_begin(&rx_stats->syncp);
rx_stats->rx_packets++;
rx_stats->rx_bytes += skb->len;
-
- switch (skb->pkt_type) {
- case PACKET_BROADCAST:
- break;
- case PACKET_MULTICAST:
+ if (skb->pkt_type == PACKET_MULTICAST)
rx_stats->rx_multicast++;
- break;
- case PACKET_OTHERHOST:
- /* Our lower layer thinks this is not local, let's make sure.
- * This allows the VLAN to have a different MAC than the
- * underlying device, and still route correctly. */
- if (!compare_ether_addr(eth_hdr(skb)->h_dest,
- vlan_dev->dev_addr))
- skb->pkt_type = PACKET_HOST;
- break;
- }
u64_stats_update_end(&rx_stats->syncp);
return true;
@@ -89,18 +100,13 @@ gro_result_t vlan_gro_frags(struct napi_struct *napi, struct vlan_group *grp,
}
EXPORT_SYMBOL(vlan_gro_frags);
-static struct sk_buff *vlan_check_reorder_header(struct sk_buff *skb)
+static struct sk_buff *vlan_reorder_header(struct sk_buff *skb)
{
- if (vlan_dev_info(skb->dev)->flags & VLAN_FLAG_REORDER_HDR) {
- if (skb_cow(skb, skb_headroom(skb)) < 0)
- skb = NULL;
- if (skb) {
- /* Lifted from Gleb's VLAN code... */
- memmove(skb->data - ETH_HLEN,
- skb->data - VLAN_ETH_HLEN, 12);
- skb->mac_header += VLAN_HLEN;
- }
- }
+ if (skb_cow(skb, skb_headroom(skb)) < 0)
+ return NULL;
+ memmove(skb->data - ETH_HLEN, skb->data - VLAN_ETH_HLEN, 2 * ETH_ALEN);
+ skb->mac_header += VLAN_HLEN;
+ skb_reset_mac_len(skb);
return skb;
}
@@ -161,7 +167,7 @@ struct sk_buff *vlan_untag(struct sk_buff *skb)
skb_pull_rcsum(skb, VLAN_HLEN);
vlan_set_encap_proto(skb, vhdr);
- skb = vlan_check_reorder_header(skb);
+ skb = vlan_reorder_header(skb);
if (unlikely(!skb))
goto err_free;
diff --git a/net/core/dev.c b/net/core/dev.c
index a54c9f8..9c58c1e 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -3114,7 +3114,7 @@ static int __netif_receive_skb(struct sk_buff *skb)
skb_reset_network_header(skb);
skb_reset_transport_header(skb);
- skb->mac_len = skb->network_header - skb->mac_header;
+ skb_reset_mac_len(skb);
pt_prev = NULL;
--
1.7.5.2
^ permalink raw reply related
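As a reading aid for the patch above, here is what putting an 802.1Q tag back into a frame amounts to at the byte level. This is a user-space sketch, not the kernel code: the patch's vlan_insert_tag works on an skb and moves the two MAC addresses down into headroom, whereas this version operates on a flat buffer and shifts the EtherType and payload up instead. `frame_insert_vlan_tag` is a hypothetical name, and the constants are redefined locally with the values the kernel headers use.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define ETH_ALEN	6	/* bytes in a MAC address */
#define VLAN_HLEN	4	/* bytes in an 802.1Q tag */
#define ETH_P_8021Q	0x8100	/* 802.1Q tag protocol identifier */

/*
 * Insert an 802.1Q tag after the destination and source MAC addresses.
 * The buffer must have room for len + VLAN_HLEN bytes.
 * Returns the new frame length, or 0 if the frame is too short.
 */
static size_t frame_insert_vlan_tag(uint8_t *frame, size_t len, uint16_t tci)
{
	uint8_t *tag = frame + 2 * ETH_ALEN;

	if (len < 2 * ETH_ALEN)
		return 0;

	/* Shift the EtherType and payload up to open a 4-byte gap. */
	memmove(tag + VLAN_HLEN, tag, len - 2 * ETH_ALEN);

	/* Fill the gap: TPID first, then the TCI, both big-endian. */
	tag[0] = ETH_P_8021Q >> 8;
	tag[1] = ETH_P_8021Q & 0xff;
	tag[2] = tci >> 8;
	tag[3] = tci & 0xff;

	return len + VLAN_HLEN;
}
```

After the call, the original EtherType sits at offset 16 rather than 12, which is why the patch recomputes mac_len once the tag has been inserted.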
* Re: [PATCH v2 2/5] net: ep93xx_eth: pass struct device to DMA API functions
From: Mika Westerberg @ 2011-06-10 16:55 UTC (permalink / raw)
To: H Hartley Sweeten
Cc: netdev@vger.kernel.org, linux-arm-kernel@lists.infradead.org,
kernel@wantstofly.org, rmallon
In-Reply-To: <ADE657CA350FB648AAC2C43247A983F001F3820B8699@AUSP01VMBX24.collaborationhost.net>
On Thu, Jun 09, 2011 at 06:40:21PM -0500, H Hartley Sweeten wrote:
>
> I just noticed this macro in include/linux/netdevice.h
>
> /* Set the sysfs physical device reference for the network logical device
> * if set prior to registration will cause a symlink during initialization.
> */
> #define SET_NETDEV_DEV(net, pdev) ((net)->dev.parent = (pdev))
>
> Is there any way you could use that macro in the probe to save the platform_device
> (with its associated device) instead of introducing a new struct device * in the
> private data?
Nice find, thanks.
I'll look into that and send a new version of the whole series soon.
^ permalink raw reply
* Re: [PATCH v2 1/5] ep93xx: set DMA masks for the ep93xx_eth
From: Mika Westerberg @ 2011-06-10 16:56 UTC (permalink / raw)
To: Petr Štetiar; +Cc: netdev, kernel, hsweeten, ryan, davem, linux-arm-kernel
In-Reply-To: <20110610154736.GA16318@ibawizard.net>
On Fri, Jun 10, 2011 at 05:47:36PM +0200, Petr Štetiar wrote:
>
> I've tested the whole series and didn't find any problems so far, thanks. You
> can add my
>
> Tested-by: Petr Štetiar <ynezz@true.cz>
>
> if you wish.
Cool. Thanks for testing :)
^ permalink raw reply
* Re: [PATCH net-next-2.6] inetpeer: lower false sharing effect
From: Tim Chen @ 2011-06-10 17:05 UTC (permalink / raw)
To: Eric Dumazet; +Cc: David Miller, netdev, Andi Kleen
In-Reply-To: <1307680287.3210.2.camel@edumazet-laptop>
On Fri, 2011-06-10 at 06:31 +0200, Eric Dumazet wrote:
>
> Thanks Tim
>
> I have some questions for further optimizations.
>
> 1) How many different destinations are used in your stress load ?
> 2) Could you provide a distribution of the packet lengths ?
> Or maybe the average length would be OK
>
>
>
Actually I have one load generator and one server connected to each
other via a 10Gb link.
The server is a 40 core 4 socket Westmere-EX machine and the load
generator is a 12 core 2 socket Westmere-EP machine.
There are 40 memcached daemons on the server each bound to a cpu core
and listening on a distinctive UDP port. The load generator has 40
threads, with each thread sending memcache requests to a particular UDP
port.
The load generator's memcache request packet has a UDP payload of 25
bytes. The response packet from the daemon has a UDP payload of 13
bytes.
The UDP packets on the load generator and server are distributed across
16 Tx-Rx queues by hashing on the UDP ports (with slight modification of
the hash flags of ixgbe).
Thanks.
Tim
^ permalink raw reply
* Re: [PATCH net-next-2.6] inetpeer: lower false sharing effect
From: Eric Dumazet @ 2011-06-10 17:17 UTC (permalink / raw)
To: Tim Chen; +Cc: David Miller, netdev, Andi Kleen
In-Reply-To: <1307725531.17300.58.camel@schen9-DESK>
On Friday, 10 June 2011 at 10:05 -0700, Tim Chen wrote:
> On Fri, 2011-06-10 at 06:31 +0200, Eric Dumazet wrote:
>
> >
> > Thanks Tim
> >
> > I have some questions for further optimizations.
> >
> > 1) How many different destinations are used in your stress load ?
> > 2) Could you provide a distribution of the packet lengths ?
> > Or maybe the average length would be OK
> >
> >
> >
>
> Actually I have one load generator and one server connected to each
> other via a 10Gb link.
>
> The server is a 40 core 4 socket Westmere-EX machine and the load
> generator is a 12 core 2 socket Westmere-EP machine.
>
> There are 40 memcached daemons on the server each bound to a cpu core
> and listening on a distinctive UDP port. The load generator has 40
> threads, with each thread sending memcache requests to a particular UDP
> port.
>
>
> The load generator's memcache request packet has a UDP payload of 25
> bytes. The response packet from the daemon has a UDP payload of 13
> bytes.
>
> The UDP packets on the load generator and server are distributed across
> 16 Tx-Rx queues by hashing on the UDP ports (with slight modification of
> the hash flags of ixgbe).
Excellent, thanks for all these details.
A few weeks ago I had the idea of adding a fast path to udp_sendmsg() for
small messages [doing the user->kernel copy before the RCU route lookup],
which should fit your workload (and other typical UDP workloads).
I'll try to cook up patches in the next few days.
Thanks
^ permalink raw reply
* Re: [PATCH 01/10] net: introduce time stamping wrapper for netif_rx.
From: Richard Cochran @ 2011-06-10 17:27 UTC (permalink / raw)
To: Stephen Hemminger; +Cc: netdev, David Miller
In-Reply-To: <20110610082056.5ab6b981@nehalam.ftrdhcpuser.net>
On Fri, Jun 10, 2011 at 08:20:56AM -0700, Stephen Hemminger wrote:
> On Fri, 10 Jun 2011 17:06:59 +0200
> Richard Cochran <richardcochran@gmail.com> wrote:
>
> > +static inline int netif_rx_defer(struct sk_buff *skb)
> > +{
> > + if (skb_defer_rx_timestamp(skb))
> > + return NET_RX_SUCCESS;
> > + return netif_rx(skb);
> > +}
>
> Obvious question: why not just put this in netif_rx?
Well, if a packet gets deferred, then that means that the PHY driver
has decided to hold the packet until it obtains the time stamp from
the PHY hardware. Then, the driver delivers the packet using netif_rx.
So, we need to have two methods to deliver a frame, one with and one
without the hook, otherwise you get packets going round in circles.
Take a look at the one PHY driver using this (so far), on line 1017 of
drivers/net/phy/dp83640.c, to see how it works.
Thanks,
Richard
PS I did consider renaming netif_rx to __netif_rx and then
implementing netif_rx as shown above, but I found many, many callers
of netif_rx which are not drivers, so I worry that bad side effects
would appear from such a change.
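
The two delivery paths described above can be sketched in userspace as follows. This is only an illustration under stated assumptions: `struct mock_skb`, `mock_netif_rx()`, `mock_defer_rx_timestamp()` and `mock_phy_timestamp_ready()` are hypothetical stand-ins, not the kernel functions, and the mock PHY driver parks at most one frame at a time.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct sk_buff. */
struct mock_skb {
    bool needs_phy_stamp; /* the PHY wants to time stamp this frame */
    bool stamped;         /* a time stamp has been attached */
};

static struct mock_skb *held; /* frame parked by the mock PHY driver */
static int delivered;         /* frames handed to the stack so far */

/* Stand-in for netif_rx(): plain delivery with no time stamping hook. */
static int mock_netif_rx(struct mock_skb *skb)
{
    (void)skb;
    delivered++;
    return 0;
}

/* Stand-in for skb_defer_rx_timestamp(): true if the PHY driver keeps the frame. */
static bool mock_defer_rx_timestamp(struct mock_skb *skb)
{
    if (!skb->needs_phy_stamp)
        return false;
    held = skb;
    return true;
}

/* Same shape as the netif_rx_defer() wrapper quoted in this thread. */
static int mock_netif_rx_defer(struct mock_skb *skb)
{
    if (mock_defer_rx_timestamp(skb))
        return 0;
    return mock_netif_rx(skb);
}

/* The PHY reported the time stamp: deliver through the hook-free path. */
static void mock_phy_timestamp_ready(void)
{
    if (!held)
        return;
    held->stamped = true;
    mock_netif_rx(held);
    held = NULL;
}
```

If `mock_phy_timestamp_ready()` called `mock_netif_rx_defer()` instead, the frame would be offered to the deferral hook a second time, which is the "round in circles" case the hook-free path avoids.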
^ permalink raw reply
* RE: [PATCH v2 2/5] net: ep93xx_eth: pass struct device to DMA API functions
From: H Hartley Sweeten @ 2011-06-10 17:30 UTC (permalink / raw)
To: Mika Westerberg
Cc: netdev@vger.kernel.org, linux-arm-kernel@lists.infradead.org,
kernel@wantstofly.org, rmallon@gmail.com
In-Reply-To: <20110610165545.GA2753@acer>
On Friday, June 10, 2011 9:56 AM, Mika Westerberg wrote:
> On Thu, Jun 09, 2011 at 06:40:21PM -0500, H Hartley Sweeten wrote:
>>
>> I just noticed this macro in include/linux/netdevice.h
>>
>> /* Set the sysfs physical device reference for the network logical device
>> * if set prior to registration will cause a symlink during initialization.
>> */
>> #define SET_NETDEV_DEV(net, pdev) ((net)->dev.parent = (pdev))
>>
>> Is there any way you could use that macro in the probe to save the platform_device
>> (with its associated device) instead of introducing a new struct device * in the
>> private data?
>
> Nice finding, thanks.
>
> I'll look into that and send a new version of the whole series soon.
It looks like after doing this in the probe:
SET_NETDEV_DEV(dev, &pdev->dev);
You can then pass the required struct device pointer to the DMA API functions
like this:
static int ep93xx_rx(struct net_device *dev, int processed, int budget)
{
...
dma_sync_single_for_cpu(dev->dev.parent, ...
Regards,
Hartley
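
A self-contained userspace sketch of the idea, for illustration only: the three structures are reduced to the single field that matters here, and `mock_probe()`, `mock_rx()` and `mock_dma_sync()` are hypothetical stand-ins rather than the driver's or the DMA API's real functions. Only the macro body is taken from the header quoted in this thread.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace stand-ins; the real structures live in the kernel headers. */
struct device { struct device *parent; };
struct platform_device { struct device dev; };
struct net_device { struct device dev; };

/* Macro body as quoted from include/linux/netdevice.h above. */
#define SET_NETDEV_DEV(net, pdev) ((net)->dev.parent = (pdev))

static const struct device *last_synced; /* records what the mock DMA call saw */

/* Hypothetical stand-in for a DMA API call that takes a struct device *. */
static void mock_dma_sync(const struct device *dev)
{
    last_synced = dev;
}

/* Probe: remember the platform device through the net_device's parent link. */
static void mock_probe(struct net_device *ndev, struct platform_device *pdev)
{
    SET_NETDEV_DEV(ndev, &pdev->dev);
}

/* Receive path: recover the struct device without any private-data field. */
static void mock_rx(struct net_device *ndev)
{
    mock_dma_sync(ndev->dev.parent);
}
```

The point of the design is that the parent link already exists in the net_device, so the private data does not need its own copy of the pointer.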
^ permalink raw reply
* Re: TCP keepalives ignored by kernel when the contain timestamps
From: Charles Bearden @ 2011-06-10 18:00 UTC (permalink / raw)
To: Eric Dumazet; +Cc: netdev
In-Reply-To: <1307724870.4044.20.camel@edumazet-laptop>
On 06/10/2011 11:54 AM, Eric Dumazet wrote:
> On Friday, 10 June 2011 at 11:39 -0500, Charles Bearden wrote:
>> On 06/10/2011 11:17 AM, Eric Dumazet wrote:
>>> On Friday, 10 June 2011 at 11:07 -0500, Charles Bearden wrote:
>>>
>>>> You are correct: the checksums in the keepalives are broken, though they are
>>>> correct in the other segments from the Win2008 server. I have updated the pastes
>>>> linked to above with 'tcpdump -v' output. I apologize for missing that problem
>>>> the first time around.
>>>
>>> Hmm, maybe it's OK: if checksums are offloaded to the NIC, tcpdump 'lies'
>>> by reporting a bad checksum, since tcpdump gets a copy of the packet
>>> before it is handled by the NIC.
>>>
>>> You should take a tcpdump on the receiving machine (the machine that
>>> apparently doesn't react to the keepalive probes)
>>
>> I should have made this clear before: these dumps were captured on the machine
>> that is ignoring the keepalives ("Ubuntu.host" in the dumps) from the other host
>> ("Win200[38].host" in the dumps). If I understand tcp checksum offloading
>> correctly, it wouldn't play a role here.
>>
>
> OK, thanks.
>
> I am curious, could you check whether the payload byte included in the
> keepalive probes has the value of the last byte sent on the session ?
>
> (Linux sends keepalive probes with no data, so implementing this
> 'feature' would need to keep track of this value in the tcp socket)
Each keepalive from the Win2008 machine has a 1-byte payload 0x00. The last byte
of the last packet with a payload before that from the Win2008 host (at
14:40:18.166394 in the paste) is also 0x00. Is that what you were asking about?
^ permalink raw reply
* Re: TCP keepalives ignored by kernel when the contain timestamps
From: Eric Dumazet @ 2011-06-10 18:04 UTC (permalink / raw)
To: Charles Bearden; +Cc: netdev
In-Reply-To: <4DF25BA2.4040200@uth.tmc.edu>
On Friday, 10 June 2011 at 13:00 -0500, Charles Bearden wrote:
> Each keepalive from the Win2008 machine has a 1-byte payload 0x00. The last byte
> of the last packet with a payload before that from the Win2008 host (at
> 14:40:18.166394 in the paste) is also 0x00. Is that what you were asking about?
>
Yes, that's fine, thanks :)
^ permalink raw reply
* Re: TCP keepalives ignored by kernel when the contain timestamps
From: Charles Bearden @ 2011-06-10 18:13 UTC (permalink / raw)
To: Eric Dumazet; +Cc: netdev
In-Reply-To: <1307729054.2872.0.camel@edumazet-laptop>
On 06/10/2011 01:04 PM, Eric Dumazet wrote:
> On Friday, 10 June 2011 at 13:00 -0500, Charles Bearden wrote:
>
>> Each keepalive from the Win2008 machine has a 1-byte payload 0x00. The last byte
>> of the last packet with a payload before that from the Win2008 host (at
>> 14:40:18.166394 in the paste) is also 0x00. Is that what you were asking about?
>>
>
> Yes, that's fine, thanks :)
One other thing: when tcp timestamps are disabled on the Linux (receiving) end,
so that the Win2008 host sends keepalives without timestamps, the checksums in
the keepalives are correct. It's only when the keepalives also contain
timestamps that the checksums are broken.
If you think this might be a Linux kernel issue, I'll be glad to keep working
with you, but I don't want to spam this list if my problem isn't relevant. Thank
you for your help in any case.
Chuck
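
For anyone who wants to check a captured keepalive by hand rather than rely on the tcpdump verdict, here is a minimal sketch of the standard Internet checksum (the one's-complement sum described in RFC 1071). The function name `inet_checksum()` is an assumption made for this example. For TCP the input would have to be the pseudo-header followed by the whole segment, and building that buffer is not shown here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * One's-complement checksum over a byte buffer, as in RFC 1071.
 * Bytes are combined as big-endian 16-bit words; an odd trailing
 * byte is padded with a zero low byte.
 */
static uint16_t inet_checksum(const uint8_t *data, size_t len)
{
    uint32_t sum = 0;
    size_t i;

    for (i = 0; i + 1 < len; i += 2)
        sum += ((uint32_t)data[i] << 8) | data[i + 1];
    if (len & 1)
        sum += (uint32_t)data[len - 1] << 8;

    /* Fold the carries back into the low 16 bits. */
    while (sum >> 16)
        sum = (sum & 0xffff) + (sum >> 16);

    return (uint16_t)~sum;
}
```

Running the function over a buffer that already contains a correct checksum field yields 0, which is how a receiver validates a segment.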
^ permalink raw reply
* Re: [PATCH 01/10] net: introduce time stamping wrapper for netif_rx.
From: Stephen Hemminger @ 2011-06-10 18:19 UTC (permalink / raw)
To: Richard Cochran; +Cc: netdev, David Miller
In-Reply-To: <20110610172747.GA2925@riccoc20.at.omicron.at>
On Fri, 10 Jun 2011 19:27:47 +0200
Richard Cochran <richardcochran@gmail.com> wrote:
> On Fri, Jun 10, 2011 at 08:20:56AM -0700, Stephen Hemminger wrote:
> > On Fri, 10 Jun 2011 17:06:59 +0200
> > Richard Cochran <richardcochran@gmail.com> wrote:
> >
> > > +static inline int netif_rx_defer(struct sk_buff *skb)
> > > +{
> > > + if (skb_defer_rx_timestamp(skb))
> > > + return NET_RX_SUCCESS;
> > > + return netif_rx(skb);
> > > +}
> >
> > Obvious question: why not just put this in netif_rx?
>
> Well, if a packet gets deferred, then that means that the PHY driver
> has decided to hold the packet until it obtains the time stamp from
> the PHY hardware. Then, the driver delivers the packet using netif_rx.
>
> So, we need to have two methods to deliver a frame, one with and one
> without the hook, otherwise you get packets going round in circles.
>
> Take a look at the one PHY driver using this (so far), on line 1017 of
> drivers/net/phy/dp83640.c, to see how it works.
>
> Thanks,
> Richard
>
> PS I did consider renaming netif_rx to __netif_rx and then
> implementing netif_rx as shown above, but I found many, many callers
> of netif_rx which are not drivers, so I worry that bad side effects
> would appear from such a change.
Why not use a timestamp-present flag, like the receive hashing code
already does?
^ permalink raw reply