netdev.vger.kernel.org archive mirror
From: "George Spelvin" <linux@horizon.com>
To: romieu@fr.zoreil.com
Cc: linux@horizon.com, netdev@vger.kernel.org
Subject: Re: v3.5: NETDEV WATCHDOG: eth0 (r8169): transmit queue 0 timed out
Date: 1 Aug 2012 19:29:53 -0400
Message-ID: <20120801232953.3791.qmail@science.horizon.com>
In-Reply-To: <20120801192455.GA29755@electric-eye.fr.zoreil.com>

Thank you for the response!

> It's up to you but I suggest that you keep them until there is something
> better.

I was going to; I just wondered if they interfered with debugging or
something.

> As long as the device recovers, you may try and lower the watchdog timeout
> as well as increase the Tx ring size a bit (x2 or x4) to minimize the
> annoyances.

Out of curiosity, how does increasing the Tx ring size help?

But okay.  Just to make sure I'm doing it right (I'm pretty sure,
but scream if I'm making a mistake), I'm making the following edits to
drivers/net/ethernet/realtek/r8169.c

#define	NUM_TX_DESC	64	/* Number of Tx descriptor registers */

I'll double that to 128.

Now, since I am actually running at gigabit speed into a pretty capable
network that I don't expect to ever block me, I should be able to send
one 1500-byte frame in 12.3 microseconds (with all overhead, one 1500-byte
frame is 1538 bytes or 12304 bits), so 128 frames in 1.6 ms.

There is the issue of TSO, so one descriptor might send more than one
frame, but I think it's likely to break at 4K pages, so the worst case is
128 * 4096 / 1500 = 350 frames in that Tx ring, which will take 4.3 ms.
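
To double-check the numbers above, here is a minimal C sketch of the
arithmetic.  The helper names are made up for illustration (they are not
from r8169.c), and it assumes a 1 Gbit/s link, so one bit takes 1 ns, with
38 bytes of per-frame overhead (14 header + 4 FCS + 8 preamble/SFD +
12 inter-frame gap), which is where the 1538 bytes comes from.

```c
#include <assert.h>
#include <stdint.h>

/* Bytes on the wire beyond the payload: header, FCS, preamble/SFD, IFG. */
#define ETH_WIRE_OVERHEAD 38u

/* Time to send one frame at 1 Gbit/s, in ns (1 bit == 1 ns, assumed). */
static uint64_t frame_time_ns(uint32_t payload_bytes)
{
    return (uint64_t)(payload_bytes + ETH_WIRE_OVERHEAD) * 8u;
}

/* Frames queued if every descriptor carries desc_bytes of payload
 * that gets segmented into mtu-sized frames (rounded up). */
static uint64_t ring_frames(uint32_t num_desc, uint32_t desc_bytes,
                            uint32_t mtu)
{
    uint64_t total = (uint64_t)num_desc * desc_bytes;

    assert(mtu > 0);
    return (total + mtu - 1) / mtu;
}

/* Time to drain that many full-size frames, in ns. */
static uint64_t ring_drain_ns(uint64_t frames, uint32_t payload_bytes)
{
    return frames * frame_time_ns(payload_bytes);
}
```

With these, ring_drain_ns(128, 1500) covers the one-frame-per-descriptor
case and ring_drain_ns(ring_frames(128, 4096, 1500), 1500) covers the
4K-per-descriptor guess.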

Either way, I can drop the Tx timeout a *lot*.

#define	RTL8169_TX_TIMEOUT	(6*HZ)

I want to drop that to HZ/100 or less.  Since I'm currently running with
CONFIG_HZ_100, and I'm not sure about the rounding (do I gain or lose
one tick due to ambiguity?) I'll bump HZ to 300 and change that to HZ/100.
That should give me a minimum of 2 ticks = 6.666 ms, which is still more
than it should take to transmit a full Tx ring.
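
The tick arithmetic can be sketched the same way.  This is only a model
of my worry, not kernel code: it assumes the pessimistic case where the
timer is armed just before a tick boundary, so up to one full tick of the
nominal timeout is lost.  The function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Ticks in an "HZ/100" timeout, with the same integer division
 * the macro expression would use. */
static uint32_t timeout_ticks(uint32_t hz)
{
    return hz / 100u;
}

/* Pessimistic lower bound on the real timeout, in microseconds,
 * assuming one tick can be lost to where "now" falls within a tick. */
static uint32_t min_timeout_us(uint32_t hz)
{
    uint32_t ticks = timeout_ticks(hz);

    assert(hz > 0);
    if (ticks == 0)
        return 0;
    return (uint32_t)(((uint64_t)(ticks - 1) * 1000000u) / hz);
}
```

Under that assumption HZ=100 leaves no guaranteed time at all (one tick,
which might be lost entirely), while HZ=300 leaves 2 of its 3 ticks.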

To make this short timeout actually work, I have to remove the "round
to nearest second" round_jiffies() calls in net/sched/sch_generic.c (there
are two that apply to dev->watchdog_timer), since I do want sub-second
timeout granularity.
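
To show why whole-second rounding defeats a short timeout, here is a
simplified userspace model.  It is an assumption on my part about the
general shape of the rounding (round down when close to the previous
second boundary, otherwise round up, and never round to a time that has
already passed); it ignores any per-CPU skew, and the real helper may
differ in detail.  model_round_to_second() is a hypothetical name.

```c
#include <assert.h>

/* Simplified model of rounding an expiry time (in ticks) to a whole
 * second.  target is the requested expiry, now is the current tick
 * count, hz is ticks per second. */
static unsigned long model_round_to_second(unsigned long target,
                                           unsigned long now,
                                           unsigned long hz)
{
    unsigned long rem;
    unsigned long rounded;

    assert(hz > 0);
    rem = target % hz;

    if (rem < hz / 4)
        rounded = target - rem;         /* round down */
    else
        rounded = target - rem + hz;    /* round up */

    /* Rounding must not move the expiry into the past. */
    return (rounded <= now) ? target : rounded;
}
```

In this model, with hz = 300, now = 1000 and a 3-tick timeout (target
1003), the result is 1200: the watchdog would fire about 200 ticks
(roughly 667 ms) later instead of 10 ms later.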

Thread overview: 4+ messages
2012-07-30 19:02 v3.5: NETDEV WATCHDOG: eth0 (r8169): transmit queue 0 timed out George Spelvin
2012-08-01 19:24 ` Francois Romieu
2012-08-01 23:29   ` George Spelvin [this message]
2012-08-02 13:32     ` George Spelvin