* r8169 tx problem (1s pause with ping)
@ 2007-06-13 0:41 Benjamin LaHaise
2007-06-13 21:18 ` Francois Romieu
0 siblings, 1 reply; 9+ messages in thread
From: Benjamin LaHaise @ 2007-06-13 0:41 UTC (permalink / raw)
To: netdev
Hello folks,
I'm seeing something odd with r8169 on FC7: doing a ping -s 1600 alternates
between a 1s latency and sub 1ms. Has anyone else seen anything like this?
The system in question is an Asus M2A-VM with an onboard RTL8111 (I think).
NAPI doesn't seem to make a difference. The kernel in question is currently
a vanilla 2.6.21.5. Sub-mtu sized packets behave normally.
02:00.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL8111/8168B PCI Express Gigabit Ethernet controller (rev 01)
PING 1.2.3.4 (1.2.3.4) 1600(1628) bytes of data.
1608 bytes from 1.2.3.4: icmp_seq=1 ttl=64 time=1000 ms
1608 bytes from 1.2.3.4: icmp_seq=2 ttl=64 time=0.816 ms
1608 bytes from 1.2.3.4: icmp_seq=3 ttl=64 time=1000 ms
1608 bytes from 1.2.3.4: icmp_seq=4 ttl=64 time=0.661 ms
-ben
--
"Time is of no importance, Mr. President, only life is important."
Don't Email: <zyntrop@kvack.org>.
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: r8169 tx problem (1s pause with ping)
2007-06-13 0:41 r8169 tx problem (1s pause with ping) Benjamin LaHaise
@ 2007-06-13 21:18 ` Francois Romieu
2007-06-14 15:33 ` David Gundersen
0 siblings, 1 reply; 9+ messages in thread
From: Francois Romieu @ 2007-06-13 21:18 UTC (permalink / raw)
To: Benjamin LaHaise; +Cc: netdev
Benjamin LaHaise <bcrl@kvack.org> :
[...]
> I'm seeing something odd with r8169 on FC7: doing a ping -s 1600 alternates
> between a 1s latency and sub 1ms. Has anyone else seen anything like this?
> The system in question is an Asus M2A-VM with an onboard RTL8111 (I think).
> NAPI doesn't seem to make a difference. The kernel in question is currently
> a vanilla 2.6.21.5. Sub-mtu sized packets behave normally.
Same thing here for my 8168 rev 01 (asrock 945G dvi LOM) with 2.6.22-rc4
and 2.6.22-rc3 + r816x patchkit.
Wonderful.
--
Ueimor
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: r8169 tx problem (1s pause with ping)
2007-06-13 21:18 ` Francois Romieu
@ 2007-06-14 15:33 ` David Gundersen
2007-06-14 22:00 ` Francois Romieu
2007-06-18 15:14 ` Benjamin LaHaise
0 siblings, 2 replies; 9+ messages in thread
From: David Gundersen @ 2007-06-14 15:33 UTC (permalink / raw)
To: netdev
[-- Attachment #1: Type: text/plain, Size: 1717 bytes --]
Francois Romieu wrote:
> Benjamin LaHaise <bcrl@kvack.org> :
> [...]
>> I'm seeing something odd with r8169 on FC7: doing a ping -s 1600 alternates
>> between a 1s latency and sub 1ms. Has anyone else seen anything like this?
>> The system in question is an Asus M2A-VM with an onboard RTL8111 (I think).
>> NAPI doesn't seem to make a difference. The kernel in question is currently
>> a vanilla 2.6.21.5. Sub-mtu sized packets behave normally.
>
> Same thing here for my 8168 rev 01 (asrock 945G dvi LOM) with 2.6.22-rc4
> and 2.6.22-rc3 + r816x patchkit.
>
> Wonderful.
>
I've got a modified version of the latest realtek (r8168) driver running
here that doesn't seem to exhibit those symptoms.
The trouble I have is that I've been playing with multiple sections of
the code and I'm not 100% sure what part(s) might impact that particular
test. The bits I know I've messed with are the bits that set the
First/Last fragment flags and the NPQ flagging section (as described in
my previous emails).
I know it's not particularly scientific of me to be changing multiple
sections of the driver at once and I'm sorry about that but it's fairly
late here and I really should get some sleep :). I might do some more
thorough testing on the weekend to find out what the minimal changes
required are to get things working.
In the mean-time I'll attach my patch for the r8168-8.001.00 realtek
driver here in case anybody else wants to have a play with it and see if
it helps them out.
Also, It might be a silly question, but have you tried taking packet
captures from the perspective of the box with the realtek chipset &
another box during the pinging and comparing the two?
Regards,
Dave.
[-- Attachment #2: r8168-8.001.00-dg.patch.gz --]
[-- Type: application/x-gzip, Size: 2237 bytes --]
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: r8169 tx problem (1s pause with ping)
2007-06-14 15:33 ` David Gundersen
@ 2007-06-14 22:00 ` Francois Romieu
2007-06-19 23:06 ` Francois Romieu
2007-06-18 15:14 ` Benjamin LaHaise
1 sibling, 1 reply; 9+ messages in thread
From: Francois Romieu @ 2007-06-14 22:00 UTC (permalink / raw)
To: David Gundersen; +Cc: netdev, Benjamin LaHaise
David Gundersen <gundy@iinet.net.au> :
[...]
> I might do some more thorough testing on the weekend to find out what
> the minimal changes required are to get things working.
In your patch, the first chunk of data (which is handed back to the
nic outside of rtl8169_xmit_frags) will not have is First fragment
bit set when nr_frags != 0. It makes me a bit nervous.
The NPQ part of your patch actually forces the start_xmit handler to
wait for all previously queued packets to be sent before telling the
nic that a packet may be available. I can understand that it will fix
the symptoms but it will implicitly shorten the TX queue a lot.
A pesky fever apart, I should be able to come with a patch before
the week end.
--
Ueimor
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: r8169 tx problem (1s pause with ping)
2007-06-14 15:33 ` David Gundersen
2007-06-14 22:00 ` Francois Romieu
@ 2007-06-18 15:14 ` Benjamin LaHaise
2007-06-19 10:45 ` David Gundersen
1 sibling, 1 reply; 9+ messages in thread
From: Benjamin LaHaise @ 2007-06-18 15:14 UTC (permalink / raw)
To: David Gundersen; +Cc: netdev
On Fri, Jun 15, 2007 at 01:33:14AM +1000, David Gundersen wrote:
> In the mean-time I'll attach my patch for the r8168-8.001.00 realtek
> driver here in case anybody else wants to have a play with it and see if
> it helps them out.
Out of curiousity, does it work if you just do a single read (ie
RTL_R8(TxPoll);) of the register before writing to it? That would clear
things up if it is a PCI posting problem.
-ben
--
"Time is of no importance, Mr. President, only life is important."
Don't Email: <zyntrop@kvack.org>.
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: r8169 tx problem (1s pause with ping)
2007-06-18 15:14 ` Benjamin LaHaise
@ 2007-06-19 10:45 ` David Gundersen
0 siblings, 0 replies; 9+ messages in thread
From: David Gundersen @ 2007-06-19 10:45 UTC (permalink / raw)
To: Benjamin LaHaise; +Cc: netdev
> Out of curiousity, does it work if you just do a single read (ie
> RTL_R8(TxPoll);) of the register before writing to it? That would clear
> things up if it is a PCI posting problem.
Hi Ben,
I tried your suggestion but it didn't seem to make any difference :(
I tried the following combinations:
- realtek original [broken]
- realtek original with the RTL_R8(TxPoll)
before RTL_W8(TxPoll, NPQ); [broken]
- my patched version without the ndelay
loop but including the RTL_R8(TxPoll)
(to see if my messing with the frag logic
was having any impact) [broken]
- my patched version including the
ndelay loop [full speed transfers]
Also, I'm not sure if I made it clear in my first post, but I'm testing
these changes on a 8168B (it's built in to my GA-945G-S3 motherboard).
I'm not sure if we can assume that the same change applied to the 8169
driver would have the same effect on the 8169 too? (is the 8168 just a
PCI express version of the [pci] 8169?)
Dave.
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: r8169 tx problem (1s pause with ping)
2007-06-14 22:00 ` Francois Romieu
@ 2007-06-19 23:06 ` Francois Romieu
2007-06-20 8:57 ` David Gundersen
0 siblings, 1 reply; 9+ messages in thread
From: Francois Romieu @ 2007-06-19 23:06 UTC (permalink / raw)
To: David Gundersen; +Cc: netdev, Benjamin LaHaise
Francois Romieu <romieu@fr.zoreil.com> :
[...]
> A pesky fever apart, I should be able to come with a patch before
> the week end.
(oops, late again)
You can try the hack below as a temporary fix.
It makes things way better on top of 2.6.22-rc5 + current r8169 patchkit at
http://www.fr.zoreil.com/people/francois/misc/20070619-2.6.22-rc5-r8169-test.patch
It applies against 2.6.22-rc5 too.
I should come with something else soon.
diff --git a/drivers/net/r8169.c b/drivers/net/r8169.c
index 8f3e0da..8c0851f 100644
--- a/drivers/net/r8169.c
+++ b/drivers/net/r8169.c
@@ -2682,6 +2688,8 @@ static void rtl8169_tx_interrupt(struct net_device *dev,
(TX_BUFFS_AVAIL(tp) >= MAX_SKB_FRAGS)) {
netif_wake_queue(dev);
}
+ if (tp->dirty_tx != tp->cur_rx)
+ RTL_W8(TxPoll, NPQ);
}
}
^ permalink raw reply related [flat|nested] 9+ messages in thread
* Re: r8169 tx problem (1s pause with ping)
2007-06-19 23:06 ` Francois Romieu
@ 2007-06-20 8:57 ` David Gundersen
2007-06-20 21:15 ` Francois Romieu
0 siblings, 1 reply; 9+ messages in thread
From: David Gundersen @ 2007-06-20 8:57 UTC (permalink / raw)
To: netdev; +Cc: Francois Romieu, Benjamin LaHaise
> diff --git a/drivers/net/r8169.c b/drivers/net/r8169.c
> index 8f3e0da..8c0851f 100644
> --- a/drivers/net/r8169.c
> +++ b/drivers/net/r8169.c
> @@ -2682,6 +2688,8 @@ static void rtl8169_tx_interrupt(struct net_device *dev,
> (TX_BUFFS_AVAIL(tp) >= MAX_SKB_FRAGS)) {
> netif_wake_queue(dev);
> }
> + if (tp->dirty_tx != tp->cur_rx)
> + RTL_W8(TxPoll, NPQ);
> }
> }
>
Hi Francois,
A few things:
- Why are you checking dirty_tx against cur_rx (shouldn't it be cur_tx?)?
- Is there a possibility that the driver could be triggering the card to
send invalid packets with that code?
I'm thinking in _start_xmit, the cur_tx pointer (assuming that's what
you meant to include above) gets incremented when the packet is sent to
the card (the RTL_W8(TxPoll,NPQ)) to indicate that the card _should_ be
able to process packets up to that point in the queue.
The interrupt routine comes along later to clean up the buffers between
dirty_tx (the last packet that the driver knows the card has sent) and
cur_tx (the point that the card could potentially be up to). What
seems could happen with the code above is that the card gets told that a
packet is ready to be sent when really it's not. I'm not sure how it
would deal with such a situation.
Regards,
Dave.
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: r8169 tx problem (1s pause with ping)
2007-06-20 8:57 ` David Gundersen
@ 2007-06-20 21:15 ` Francois Romieu
0 siblings, 0 replies; 9+ messages in thread
From: Francois Romieu @ 2007-06-20 21:15 UTC (permalink / raw)
To: David Gundersen; +Cc: netdev, Benjamin LaHaise
David Gundersen <gundy@iinet.net.au> :
[...]
> - Why are you checking dirty_tx against cur_rx (shouldn't it be cur_tx?)?
Usual "try something, send something else" bug.
> - Is there a possibility that the driver could be triggering the card to
> send invalid packets with that code?
[snip]
I do not know but I doubt that it matters.
That being said, the patch below should make things better (though still
not perfect):
diff --git a/drivers/net/r8169.c b/drivers/net/r8169.c
index 8f3e0da..35054e8 100644
--- a/drivers/net/r8169.c
+++ b/drivers/net/r8169.c
@@ -2514,7 +2514,7 @@ static inline u32 rtl8169_tso_csum(struct sk_buff *skb, struct net_device *dev)
static int rtl8169_start_xmit(struct sk_buff *skb, struct net_device *dev)
{
struct rtl8169_private *tp = netdev_priv(dev);
- unsigned int frags, entry = tp->cur_tx % NUM_TX_DESC;
+ unsigned int frags, cur_tx, entry = tp->cur_tx % NUM_TX_DESC;
struct TxDesc *txd = tp->TxDescArray + entry;
void __iomem *ioaddr = tp->mmio_addr;
dma_addr_t mapping;
@@ -2567,13 +2567,17 @@ static int rtl8169_start_xmit(struct sk_buff *skb, struct net_device *dev)
dev->trans_start = jiffies;
+ cur_tx = tp->cur_tx;
+
tp->cur_tx += frags + 1;
- smp_wmb();
+ mmiowb();
- RTL_W8(TxPoll, NPQ); /* set polling bit */
+ smp_mb();
- if (TX_BUFFS_AVAIL(tp) < MAX_SKB_FRAGS) {
+ if (cur_tx == tp->dirty_tx)
+ RTL_W8(TxPoll, NPQ);
+ else if (TX_BUFFS_AVAIL(tp) < MAX_SKB_FRAGS) {
netif_stop_queue(dev);
smp_rmb();
if (TX_BUFFS_AVAIL(tp) >= MAX_SKB_FRAGS)
@@ -2677,10 +2681,18 @@ static void rtl8169_tx_interrupt(struct net_device *dev,
if (tp->dirty_tx != dirty_tx) {
tp->dirty_tx = dirty_tx;
- smp_wmb();
- if (netif_queue_stopped(dev) &&
- (TX_BUFFS_AVAIL(tp) >= MAX_SKB_FRAGS)) {
- netif_wake_queue(dev);
+ smp_mb();
+ if (unlikely(netif_queue_stopped(dev))) {
+ netif_tx_lock(dev);
+ if (TX_BUFFS_AVAIL(tp) >= MAX_SKB_FRAGS)
+ netif_wake_queue(dev);
+ if (dirty_tx != tp->cur_tx)
+ RTL_W8(TxPoll, NPQ);
+ netif_tx_unlock(dev);
+ } else if (dirty_tx != tp->cur_tx) {
+ netif_tx_lock(dev);
+ RTL_W8(TxPoll, NPQ);
+ netif_tx_unlock(dev);
}
}
}
^ permalink raw reply related [flat|nested] 9+ messages in thread
end of thread, other threads:[~2007-06-20 21:17 UTC | newest]
Thread overview: 9+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2007-06-13 0:41 r8169 tx problem (1s pause with ping) Benjamin LaHaise
2007-06-13 21:18 ` Francois Romieu
2007-06-14 15:33 ` David Gundersen
2007-06-14 22:00 ` Francois Romieu
2007-06-19 23:06 ` Francois Romieu
2007-06-20 8:57 ` David Gundersen
2007-06-20 21:15 ` Francois Romieu
2007-06-18 15:14 ` Benjamin LaHaise
2007-06-19 10:45 ` David Gundersen
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).