From: Laurent DENIEL <laurent.deniel@thalesatm.com>
To: hadi@cyberus.ca
Cc: "David S. Miller" <davem@redhat.com>,
	jgarzik@pobox.com, shmulik.hen@intel.com,
	bonding-devel@lists.sourceforge.net, netdev@oss.sgi.com
Subject: Re: [Bonding-devel] Re: [SET 2][PATCH 2/8][bonding] Propagating master's settings to slaves
Date: Tue, 12 Aug 2003 16:36:40 +0200
Message-ID: <3F38FB78.4536593A@thalesatm.com>
In-Reply-To: 1060698412.1063.7.camel@jzny.localdomain

jamal wrote:
> 
> On Tue, 2003-08-12 at 10:10, Laurent DENIEL wrote:
> > "David S. Miller" wrote:
> 
> > That's why in really *safe* systems, we do not use routing daemon
> > but only static routes ;-)
> >
> > And there is a BIG difference :
> >
> > When a user level daemon dies, you have to be sure that some stuff
> > exists to monitor and recover from that situation (either by
> > restarting the faulty daemon (if it could recover in time, which
> > I doubt with the bonding case), or by switching to a new machine
> > in a fault tolerant configuration). With a kernel oops, there is
> > NOTHING to do in such a fault tolerant system, since the
> > machine is unusable (this is the same as a hardware failure).
> >
> > But people do not understand the constraints of really safe
> > systems.
> >
> 
> We have hardware watchdog timers to put the kernel into a known state by
> rebooting. If you were not aware of all these RAS efforts on Linux
> (projects like kexec for example) I suggest you start looking at them.

I am aware of this great stuff but see below.

> The kernel will oops and the app will die because of one thing: _A
> software bug_. It doesn't matter what causes the death of the kernel or
> app (a misconfig, for example, causing a broadcast loop making the app
> die is a bug).
> If you want a safe system then you do not trust software, nor do you
> trust hardware. You must have workarounds in case they go berserk. Heck,
> the only entity you should trust is God, and that's assuming you believe
> in God.

Hardware / software watchdogs are great but do not necessarily
solve all problems, especially where timing constraints are important.
I prefer to rely on the timing of the bonding kernel code to switch
NICs in milliseconds rather than wait seconds or minutes for a user space
daemon to get control and handle the problem. And yes, I am aware of
real time class scheduling and so on, but you say don't trust the
software, and I agree, so I prefer a direct kernel hang to nothing
or to something too late (a software watchdog will not help in that case).

Laurent

