From: Jeff Garzik <jeff@garzik.org>
To: Brice Goglin <brice@myri.com>
Cc: netdev@vger.kernel.org
Subject: Re: [PATCH 2/3] myri10ge: limit the number of recoveries
Date: Sun, 03 Jun 2007 11:55:45 -0400
Message-ID: <4662E481.4030508@garzik.org>
In-Reply-To: <465DCD16.8050907@myri.com>
Brice Goglin wrote:
> Limit the number of recoveries from a NIC hw watchdog reset to 1 by default.
> It enables detection of defective NICs immediately since these memory parity
> errors are expected to happen very rarely (less than once per century*NIC).
> However, a defective NIC (very rare, fortunately) can see such an error
> quite often, ie. every few minutes under high load.
>
> Make the limit tunable to allow people with mission critical installations
> to crank up the tunable and recover an INTMAX number of times while waiting
> for a downtime window to replace the NIC. The performance won't be optimal,
> but at least, it will still work.
>
> Signed-off-by: Brice Goglin <brice@myri.com>
> ---
> drivers/net/myri10ge/myri10ge.c | 15 +++++++++++++--
> 1 file changed, 13 insertions(+), 2 deletions(-)
NAK. Random breakage (unrelated to silicon errata) can happen in any
field installation, and manifest itself in any number of ways.

If defective NICs are truly rare, it does not sound worth adding this
workaround to the driver.

If I had to guess, this is a meaningless gesture (from a technical
standpoint) for a Big Customer(tm) who is currently throwing a
temper-tantrum... :)

By definition, if you can give them a patch, and they can wait for the
driver patch to go upstream, then it is not a "mission critical"
situation; otherwise they would already have the patch direct from you.

Also, "mission critical" tends to imply that you cannot remove the
driver from operation either, which is counter to the logic of patching
a driver for the problem.

Jeff
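
For readers following the thread, the policy described in the quoted patch (a bounded number of recoveries from a watchdog reset, 1 by default, with a tunable limit) can be illustrated roughly as below. This is a user-space sketch only, not the code from the patch: `reset_recover_limit`, `struct nic_state`, `nic_init` and `nic_handle_watchdog_reset` are hypothetical names, and a real driver would expose the limit as a module parameter and perform an actual firmware reload where the comment indicates.

```c
#include <stdbool.h>

/* Hypothetical tunable; 1 mirrors the default stated in the patch description. */
static int reset_recover_limit = 1;

/* Hypothetical, minimal per-NIC state for the illustration. */
struct nic_state {
    int recoveries_left; /* how many more resets we are willing to recover from */
    bool running;        /* false once the device has been left down */
};

static void nic_init(struct nic_state *nic)
{
    nic->recoveries_left = reset_recover_limit;
    nic->running = true;
}

/*
 * Called when the watchdog detects a NIC reset.
 * Returns true if a recovery was attempted, false if the budget is
 * exhausted and the device is left down (flagging a likely defective NIC).
 */
static bool nic_handle_watchdog_reset(struct nic_state *nic)
{
    if (nic->recoveries_left <= 0) {
        nic->running = false;
        return false;
    }
    nic->recoveries_left--;
    /* A real driver would reload firmware and reopen the interface here. */
    return true;
}
```

With the default limit of 1, the first reset is recovered and the second leaves the device down; raising the limit trades early detection of a defective NIC for continued operation, which is the trade-off the reply above argues against carrying in the driver.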