RE: Linux is not reliable enough?

linuxppc-dev.lists.ozlabs.org archive mirror

* RE: Linux is not reliable enough?
@ 2004-07-27 14:41 Wells, Charles
  2004-07-27 15:20 ` Mark Chambers
  0 siblings, 1 reply; 17+ messages in thread
From: Wells, Charles @ 2004-07-27 14:41 UTC (permalink / raw)
  To: 'Mark Chambers'; +Cc: linuxppc-embedded

Mark,

A couple of comments on your comments (sorry for keeping this going).

> One point I was trying to make is that assuming the underlying hardware
> is good, all software is theoretically perfect.

I can't imagine this statement being true.  It's true that if the hardware
is bad, the software may not operate correctly, but the converse isn't true.
The following code is incorrect, regardless of the state of the hardware it
runs on:

  int a[100], b = 123;
  a[b] = 0;

I guess I'm taking exception to your use of the phrase "all software".

> That is, given the same set of input conditions it will always produce
> the same output.

If ...

0. Asynchronous interrupts are enabled, or
1. Your code reads an A/D converter and acts on that data, or
2. Your code acts on operator input, or
3. One of several other normal situations holds,

then this statement, while true, just doesn't apply. In my experience,
real-world situations that allow the assumption of software determinism
are remarkably rare.

Ultimately what we're talking about here is: who has to be convinced of
the reliability of the chosen OS?  I personally spent many years
designing and deploying hospital-grade medical monitors.  If human
life is at stake, there are regulatory agencies looking over your shoulder.

In the medical business, there is our own FDA as well as a number of other
agencies (including the German TUV (IMHO the toughest taskmaster of
them all)).  You simply aren't going to sell your device until you get
approval from the appropriate regulatory agency. It is the regulatory
agencies you need to convince.

What the agencies are looking for in your submission for approval to
sell your device is extensive test data that your company is willing
to assert is accurate and that demonstrates this reliability.  This is
a huge task.  So, what you do is "pass the buck."  You find a vendor
of a commercial OS that already has done this testing and you include
their test data (and their assertions) in your submission to the
regulatory agencies.

I suppose I've wandered a bit off-topic here, but it seemed relevant.

Regards,
Charlie

** Sent via the linuxppc-embedded mail list. See http://lists.linuxppc.org/


* Re: Linux is not reliable enough?
  2004-07-27 14:41 Linux is not reliable enough? Wells, Charles
@ 2004-07-27 15:20 ` Mark Chambers
  0 siblings, 0 replies; 17+ messages in thread
From: Mark Chambers @ 2004-07-27 15:20 UTC (permalink / raw)
  To: Wells, Charles; +Cc: linuxppc-embedded


Hah! I'm breaking my promise!  See below.

> Mark,
>
> A couple of comments on your comments (sorry for keeping this going).
>
> > One point I was trying to make is that assuming the underlying hardware
> > is good, all software is theoretically perfect.
>
> I can't imagine this statement being true.  It's true that if the hardware
> is bad, the software may not operate correctly, but the converse isn't true.
> The following code is incorrect, regardless of the state of the hardware it
> runs on:
>
>   int a[100], b = 123;
>   a[b] = 0;
>
> I guess I'm taking exception to your use of the phrase "all software".
>
>

What I mean is, if &a = 0x10000, then a[b] will always write 0 to 0x101ec.
That may not be smart, may not be what you intended to do, but the uP will
always do the exact same thing.  (Does this mean 'C' is unreliable because
it lets you do things like that?)
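
The address arithmetic above can be checked with a minimal C sketch. The
base address 0x10000 is taken from the example; the 4-byte element size is
an assumption (sizeof(int) is implementation-defined), and the helper name
element_address is made up for illustration. The sketch only computes the
address with integer arithmetic. It does not perform the out-of-bounds
write, which the C standard leaves undefined, so a compiler is not obliged
to emit that particular store.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: byte address of element `index` in an array that
 * starts at `base` and whose elements are `elem_size` bytes each. */
uintptr_t element_address(uintptr_t base, size_t index, size_t elem_size)
{
    return base + (uintptr_t)(index * elem_size);
}
```

With a 4-byte int, element_address(0x10000, 123, 4) evaluates to 0x101ec,
since 123 * 4 = 492 = 0x1ec.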

> > That is, given the same set of input conditions it will always produce
> > the same output.
>
> If ...
>
> 0. Asynchronous interrupts are enabled, or
> 1. Your code reads an A/D converter and acts on that data, or
> 2. Your code acts on operator input, or
> 3. One of several other normal situations hold,
>
> then this statement, while true, just doesn't apply. In my experience,
> real-world situations that allow the assumption of software determinism
> are remarkably rare.
>

Yes, but they are different input conditions then.

>
> Ultimately what we're talking about here is: who has to be convinced of
> the reliability of the chosen OS?  I personally spent many years
> designing and deploying hospital-grade medical monitors.  If human
> life is at stake, there are regulatory agencies looking over your shoulder.
>
> In the medical business, there is our own FDA as well as a number of other
> agencies (including the German TUV (IMHO the toughest taskmaster of
> them all)).  You simply aren't going to sell your device until you get
> approval from the appropriate regulatory agency. It is the regulatory
> agencies you need to convince.
>
> What the agencies are looking for in your submission for approval to
> sell your device is extensive test data that your company is willing
> to assert is accurate and that demonstrates this reliability.  This is
> a huge task.  So, what you do is "pass the buck."  You find a vendor
> of a commercial OS that already has done this testing and you include
> their test data (and their assertions) in your submission to the
> regulatory agencies.
>
>

I agree completely, and I think you're making my point: some sort of
intrinsic reliability isn't the real issue, but rather what tools you need to
get the job done.

> I suppose I've wandered a bit off-topic here, but it seemed relevant.
>

Sure, it's fun, and we could go round and round until we get real jobs :-)

> Regards,
> Charlie
>
>



* RE: Linux is not reliable enough?
@ 2004-07-27 15:59 Mészáros Lajos
  2004-07-27 17:10 ` Oliver Korpilla
  0 siblings, 1 reply; 17+ messages in thread
From: Mészáros Lajos @ 2004-07-27 15:59 UTC (permalink / raw)
  To: Mark Chambers, Wells, Charles; +Cc: linuxppc-embedded


> >   int a[100], b = 123;
> >   a[b] = 0;
> >
> > I guess I'm taking exception to your use of the phrase "all software".
> >
> >
>
> What I mean is, if &a = 0x10000, then a[b] will always write 0 to 0x101ec.
> That may not be smart, may not be what you intended to do, but the uP will
> always do the exact same thing.  (Does this mean 'C' is unreliable because
> it lets you do things like that?)

Yes, 'C' is unreliable, because writing beyond the
"maxindex" lets you overwrite others' data and others'
code, and DOES open a backdoor for viruses.

On the other hand, testing every index every time
against min and max slows execution.
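
The trade-off described here can be sketched in C as a store that checks
its index before writing. This is only an illustration under assumptions:
the name checked_store and the TABLE_LEN constant are hypothetical, not
something from any particular library or from this thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TABLE_LEN 100  /* matches the a[100] example; otherwise arbitrary */

/* Hypothetical checked store: writes `value` to table[index] only when
 * the index is in range. Returns true on success, false if rejected. */
bool checked_store(int *table, size_t len, size_t index, int value)
{
    /* size_t is unsigned, so this single comparison covers both the
     * "min" and the "max" test; it is the per-access cost in question. */
    if (index >= len) {
        return false;
    }
    table[index] = value;
    return true;
}
```

With an array of TABLE_LEN ints, a store to index 99 succeeds, while the
a[123] case from the example is rejected instead of overwriting whatever
lies beyond the array. The price is one compare and branch on every access.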

So what?

    Ludwig


* Re: Linux is not reliable enough?
  2004-07-27 15:59 Mészáros Lajos
@ 2004-07-27 17:10 ` Oliver Korpilla
  2004-07-27 23:08   ` Conn Clark
  0 siblings, 1 reply; 17+ messages in thread
From: Oliver Korpilla @ 2004-07-27 17:10 UTC (permalink / raw)
  To: Mészáros Lajos; +Cc: Mark Chambers, Wells, Charles, linuxppc-embedded

Mészáros Lajos wrote:
>
>Yes, 'C' is unreliable because writing beyond the "maxindex" lets
>you overwrite other's data, other's code and DOES make backdoor for
>viruses.
>
>On the other hand testing every index every time for min and max slowes
>the executing.

QNX does not, and Linux does not, and with both C is as unreliable as ever.

However, a failure at the driver level in QNX is not as potentially
harmful as in Linux. While this does not rule out failure, and does not
say a thing about the actual quality of QNX or Linux code, it's a nice
_additional_ feature with respect to stability.

I guess Linux lacking proper certification for some applications is a
much bigger obstacle in the minds of managers, anyway.

But somehow this is getting off-topic quickly, isn't it?

With kind regards,
Oliver Korpilla


* Re: Linux is not reliable enough?
  2004-07-27 17:10 ` Oliver Korpilla
@ 2004-07-27 23:08   ` Conn Clark
  0 siblings, 0 replies; 17+ messages in thread
From: Conn Clark @ 2004-07-27 23:08 UTC (permalink / raw)
  To: okorpil
  Cc: Mészáros Lajos, Mark Chambers, Wells, Charles,
	linuxppc-embedded


Oliver Korpilla wrote:
>
> Mészáros Lajos wrote:
>
>>
>> Yes, 'C' is unreliable because writing beyond the "maxindex" lets
>> you overwrite other's data, other's code and DOES make backdoor for
>> viruses.
>>
>> On the other hand testing every index every time for min and max slowes
>> the executing.
>
>
> QNX does not, and Linux does not, and with both C is as unreliable as ever.
>
> However, a failure in a QNX in the driver level is not as potentially
> malicious as in Linux. While this does not exclude failure, and does not
> say a thing about the actual quality of QNX or Linux code, it's a nice
> _additional_ feature related towards stability.
>
> I guess Linux lacking proper certification for some applications is a
> much bigger obstacle in the minds of managers, anyway.
>
> But somehow this is getting offtopic, quickly, isn't it?
>
> With kind regards,
> Oliver Korpilla
>

	If C is not reliable enough then nothing is. In assembly (or even
hand-coded machine language for the real hard-core people) you can do
just about anything. Since all languages must resort to this at some
point, our foundation is built on sand and we are eternally screwed.

	Only the last 4 words of the last sentence of this statement are false.
It's not the language's job to write good solid code or verify it; it's the
programmer's responsibility. The only code you can truly trust is code
you have total control over. For truly solid code, no funky error-checking
computer science language will ever replace good practices,
thorough testing, and documentation.

	The reason for this is obvious: at some point higher-level languages
must be built using a lower-level language to avoid the chicken/egg paradox.
Hence the quality of a higher-level language can never be assured, since
its foundation is made of sand.

It's a poor musician that blames their instrument.

-- Conn Clark

*****************************************************************
Give a man a match and you heat him for a moment. Set him on fire
and you'll heat him for life.
*****************************************************************

Conn Clark
Engineering Stooge				clark@esteem.com
Electronic Systems Technology Inc.		www.esteem.com



* Re: random ramblings on 8xx patches (long and tedious :-)
@ 2004-07-23 17:22 Wolfgang Denk
  2004-07-23 21:06 ` Linux is not reliable enough? Kevin P. Dankwardt
  0 siblings, 1 reply; 17+ messages in thread
From: Wolfgang Denk @ 2004-07-23 17:22 UTC (permalink / raw)
  To: Pantelis Antoniou; +Cc: Robert P. J. Day, Embedded Linux PPC list

In message <410123EE.4000602@intracom.gr> you wrote:
>
> IMHO we shouldn't even bother.

oops???

> Then everything should be automatically set up at run-time, based
> on probing code which should detect the rest.

You must be joking. This is for embedded systems, and code  size  and
especially boot time are critical.

Also, this suggestion does not cover Robert's intention of preventing
the user from selecting bogus configuration options for  stuff  which
doesn't exist on his chip.

Best regards,

Wolfgang Denk

--
Software Engineering:  Embedded and Realtime Systems,  Embedded Linux
Phone: (+49)-8142-4596-87  Fax: (+49)-8142-4596-88  Email: wd@denx.de
"Life sucks, but it's better than the alternative."
- Peter da Silva


* Linux is not reliable enough?
  2004-07-23 17:22 random ramblings on 8xx patches (long and tedious :-) Wolfgang Denk
@ 2004-07-23 21:06 ` Kevin P. Dankwardt
  2004-07-24  3:02   ` Linh Dang
                     ` (2 more replies)
  0 siblings, 3 replies; 17+ messages in thread
From: Kevin P. Dankwardt @ 2004-07-23 21:06 UTC (permalink / raw)
  To: Embedded Linux PPC list

I am working with a team on a project where their customer is concerned
about the reliability of Linux. The customer wants to go with QNX because of
the belief that QNX Neutrino is inherently more reliable. This belief
revolves around the differences in design where drivers in QNX do not reside
in the same address space as the (micro-)kernel.

What the team was hoping to use is a MPC5200 based system and the ELDK.

The team needs to specifically address their customer's concern that a
single driver can crash the operating system in Linux, since the driver
resides in the same memory space as the kernel. They need to present
convincing arguments to the customer's Chief Software Architect.

Does anyone know of any good resources/references to address these concerns?
Any evidence, either way, that QNX Neutrino is more reliable?

Will the ELDK be adopting any of the Carrier Grade Linux requirements for
reliability? Any other projects like this of note?

Does anyone know of any embedded Linux projects where human lives really do
depend upon Linux to be robust and reliable?

Is UserMode Linux a possibility? Can one create custom drivers for UML and
mitigate risks that way?

Any clever ideas? Any clever, actually tested, ideas?

Despite all of the hype, would any of you be willing to look a customer in
the eye and say that an embedded Linux system can be reliable enough for
human lives to depend on it?

Thanks,
Kevin Dankwardt


* Re: Linux is not reliable enough?
  2004-07-23 21:06 ` Linux is not reliable enough? Kevin P. Dankwardt
@ 2004-07-24  3:02   ` Linh Dang
  2004-07-24  6:29     ` Der Herr Hofrat
  2004-07-25 16:23     ` Wolfgang Denk
  2004-07-24 11:35   ` Mark Chambers
  2004-07-24 21:44   ` Sylvain Munaut
  2 siblings, 2 replies; 17+ messages in thread
From: Linh Dang @ 2004-07-24  3:02 UTC (permalink / raw)
  To: Embedded Linux PPC list

Well, the X server doesn't run in the same VM space as the
kernel. However, on my PC, X is the thing most likely to
put the system into an unusable state (frozen keyboard, mouse and
screen).

QNX's drivers run in their own process space but they still have
direct access to the hardware. A driver can screw up the hardware
regardless of which VM sandbox it runs in.

my 0.02$

--
Linh Dang


* Re: Linux is not reliable enough?
  2004-07-24  3:02   ` Linh Dang
@ 2004-07-24  6:29     ` Der Herr Hofrat
  2004-07-25 16:23     ` Wolfgang Denk
  1 sibling, 0 replies; 17+ messages in thread
From: Der Herr Hofrat @ 2004-07-24  6:29 UTC (permalink / raw)
  To: Linh Dang; +Cc: Embedded Linux PPC list

>
> Well, X server doesn't run in the same vm space as the
> kernel. however, on my PC, X is the thing that would most likely to
> put the system into an unusable state (frozen keyboard, mouse and
> screen).

In principle that is the way it should be, but this is not quite true,
as X uses user-space (SUID root) drivers quite a lot. So even though
X is in its own VM, it is fiddling with low-level resources that are normally
under kernel control (i.e. sti/cli). So this is actually the
same problem in Linux as you noted for QNX.

hofrat


* Re: Linux is not reliable enough?
  2004-07-24  3:02   ` Linh Dang
  2004-07-24  6:29     ` Der Herr Hofrat
@ 2004-07-25 16:23     ` Wolfgang Denk
  1 sibling, 0 replies; 17+ messages in thread
From: Wolfgang Denk @ 2004-07-25 16:23 UTC (permalink / raw)
  To: Linh Dang; +Cc: Embedded Linux PPC list


In message <wn5pt6mw1jc.fsf@linhd-2.ca.nortel.com> you wrote:
>
> Well, X server doesn't run in the same vm space as the
> kernel. however, on my PC, X is the thing that would most likely to
> put the system into an unusable state (frozen keyboard, mouse and
> screen).

This just means that X has crashed. I'm  pretty  sure  the  system  is
still  running  fine, and you can for example log in over the network.
This example has little relevance to embedded systems.

> QNX's drivers run in their own process space but they still have
> direct access to the hardware. A driver can screw up the hw regardless
> whatever vm-sandbox in which it run.

Indeed.

Best regards,

Wolfgang Denk

--
Software Engineering:  Embedded and Realtime Systems,  Embedded Linux
Phone: (+49)-8142-4596-87  Fax: (+49)-8142-4596-88  Email: wd@denx.de
Dear Lord: I just want *one* one-armed manager so  I  never  have  to
hear "On the other hand", again.


* Re: Linux is not reliable enough?
  2004-07-23 21:06 ` Linux is not reliable enough? Kevin P. Dankwardt
  2004-07-24  3:02   ` Linh Dang
@ 2004-07-24 11:35   ` Mark Chambers
  2004-07-26  7:49     ` Marius Groeger
  2004-07-24 21:44   ` Sylvain Munaut
  2 siblings, 1 reply; 17+ messages in thread
From: Mark Chambers @ 2004-07-24 11:35 UTC (permalink / raw)
  To: Kevin P. Dankwardt, Embedded Linux PPC list


Kevin,

I suspect you have a political problem here.  Mr. Chief Software Architect
(which is a title for someone who doesn't actually *do* anything) is not
going to lose his job for choosing QNX.  My suggestion is this:  Point out
that the only way to prove reliability is with testing.  Linux is open
source, it won't cost anything to put it on a side by side test, and let
Linux speak for itself.

- Linux is open source, any potential bugs are theoretically fixable.  What
do you do if QNX develops a problem?
- You can hunt around for some of Linus's comments about microkernel
architecture.  He thinks they're stupid, only he says it more poetically.
- Don't fall into this trap of software mysticism, that one operating system
is somehow intrinsically more reliable than another.  There's good and bad
software, to be sure, but even Windows can be reliable in certain carefully
constrained environments.  It's only ones and zeros.

My $.02

Mark Chambers

----- Original Message -----
From: "Kevin P. Dankwardt" <k@kcomputing.com>
To: "Embedded Linux PPC list" <linuxppc-embedded@lists.linuxppc.org>
Sent: Friday, July 23, 2004 5:06 PM
Subject: Linux is not reliable enough?


>
> I am working with a team on a project where their customer is concerned
> about the reliability of Linux. The customer wants to go with QNX because of
> the belief that QNX Neutrino is inherently more reliable. This belief
> revolves around the differences in design where drivers in QNX do not reside
> in the same address space as the (micro-)kernel.
>
> What the team was hoping to use is a MPC5200 based system and the ELDK.
>
<snip>



* Re: Linux is not reliable enough?
  2004-07-24 11:35   ` Mark Chambers
@ 2004-07-26  7:49     ` Marius Groeger
  2004-07-26 13:46       ` Mark Chambers
  0 siblings, 1 reply; 17+ messages in thread
From: Marius Groeger @ 2004-07-26  7:49 UTC (permalink / raw)
  To: Mark Chambers; +Cc: Kevin P. Dankwardt, Embedded Linux PPC list

On Sat, 24 Jul 2004, Mark Chambers wrote:

> that the only way to prove reliability is with testing.  Linux is open
> source, it won't cost anything to put it on a side by side test, and let
> Linux speak for itself.

Getting to the point where you can run this side by side test *will*
cost money, and typically quite a lot of it. It is not likely
that Kevin's customer is going to pay for the implementation on two OSes,
even if it is only to the prototype stage.

So, thinking about the right OS for the job in advance, as they do, is
a good idea. Only the thinking must be done right, of course :-)

Regards,
Marius

--
Marius Groeger <mgroeger@sysgo.com>
SYSGO AG                      Embedded and Real-Time Software
Voice: +49 6136 9948 0                  FAX: +49 6136 9948 10
www.sysgo.com | www.elinos.com | www.osek.de | www.imerva.com


* Re: Linux is not reliable enough?
  2004-07-26  7:49     ` Marius Groeger
@ 2004-07-26 13:46       ` Mark Chambers
  2004-07-26 14:31         ` Der Herr Hofrat
                           ` (2 more replies)
  0 siblings, 3 replies; 17+ messages in thread
From: Mark Chambers @ 2004-07-26 13:46 UTC (permalink / raw)
  To: Marius Groeger; +Cc: linuxppc-embedded

> On Sat, 24 Jul 2004, Mark Chambers wrote:
>
> > that the only way to prove reliability is with testing.  Linux is open
> > source, it won't cost anything to put it on a side by side test, and let
> > Linux speak for itself.
>
> Getting to the point where you can run this side by side test *will*
> cost money, and typically rather much, what's more. It is not likely
> that Kevin's customer is going to pay the implementation for two OSes,
> even if it is only to the prototype stage.
>

Yes, a good point.  But I'm speaking with a salesman's voice.  An expert
like Kevin can no doubt prototype something fairly quickly,
and getting the customer to see something actually working is very powerful.
It puts the ball in the Chief Software Architect's (the CSA, hereafter :-)
court to justify the additional expense of QNX.

> So, thinking about the right OS for the job in advance, as they do, is
> a good idea. Only the thinking must be done right, of course :-)
>

Indeed.  I guess I should spell out what I think is wrong with the CSA's
apparent thinking:  He points out an aspect of linux, namely that drivers
can crash the system, as an issue that somehow makes linux intrinsically
unreliable.  But if you write drivers that don't crash the system then linux
is not unreliable.  The only operating system that doesn't allow a clever
programmer to crash is one that doesn't do anything.  Microkernels, they
say, allow you to do nifty things like replace the file system without
rebooting.  So that means you could swap in a buggy filesystem and destroy
the data on your disc/flash.  Without rebooting.  Which is good since you
won't be able to boot from your corrupted filesystem, which won't show up
until the next power failure, while the poor nurse with a flashlight talks
to a guy on the phone who assures her QNX can't fail.  So every OS, and
every feature, has its pros and cons.  The question for any CSA is not 'is
this reliable' but 'can I make a reliable system using this component'?
Will the OS eat itself, or do I only have to worry about the mistakes I
make?  A carefully constructed linux system should be good for 5 or even 6
nines of reliability.
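
To put "5 or even 6 nines" into concrete terms, here is a small C sketch
of the usual arithmetic. It rests on two assumptions that are not in the
message itself: that "nines of reliability" is read as availability
(fraction of time the system is up), and that a year is 365.25 days. The
function name is made up for illustration.

```c
#include <assert.h>

/* Hypothetical helper: permitted downtime in seconds per year for an
 * availability of N nines (N = 5 means 99.999% uptime). */
double downtime_seconds_per_year(int nines)
{
    const double seconds_per_year = 365.25 * 24.0 * 3600.0;
    double unavailability = 1.0;

    /* Each additional nine divides the allowed downtime fraction by 10. */
    for (int i = 0; i < nines; i++) {
        unavailability /= 10.0;
    }
    return seconds_per_year * unavailability;
}
```

Under these assumptions, five nines allows roughly 316 seconds (a little
over five minutes) of downtime per year, and six nines roughly 32 seconds.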

Mark Chambers


* Re: Linux is not reliable enough?
  2004-07-26 13:46       ` Mark Chambers
@ 2004-07-26 14:31         ` Der Herr Hofrat
  2004-07-26 15:42         ` Marius Groeger
  2004-07-27 11:20         ` Robert Kaiser
  2 siblings, 0 replies; 17+ messages in thread
From: Der Herr Hofrat @ 2004-07-26 14:31 UTC (permalink / raw)
  To: Mark Chambers; +Cc: Marius Groeger, linuxppc-embedded

>
> > On Sat, 24 Jul 2004, Mark Chambers wrote:
> >
> > > that the only way to prove reliability is with testing.  Linux is open
> > > source, it won't cost anything to put it on a side by side test, and let
> > > Linux speak for itself.
> >
> > Getting to the point where you can run this side by side test *will*
> > cost money, and typically rather much, what's more. It is not likely
> > that Kevin's customer is going to pay the implementation for two OSes,
> > even if it is only to the prototype stage.
> >
>
> Yes, a good point.  But I'm speaking with a salesman voice.  For someone who
> is an expert like Kevin he can no doubt prototype something fairly quickly,
> and getting the customer to see something actually working is very powerful.
> It puts the ball in the Chief Software Architect's (the CSA, hereafter :-)
> court to justify the additional expense of QNX.
>

Prototyping and testing can only prove things if you
can reliably reproduce the rare failure cases, which seriously
limits this possibility. I guess nobody doubts
that Linux is stable under typical load situations (whatever
those may be...).

> > So, thinking about the right OS for the job in advance, as they do, is
> > a good idea. Only the thinking must be done right, of course :-)
> >
>
> Indeed.  I guess I should spell out what I think is wrong with the CSAs
> apparent thinking:  He points out an aspect of linux, namely that drivers
> can crash the system, as an issue that somehow makes linux intrinsically
> unreliable.  But if you write drivers that don't crash the system then linux
> is not unreliable.  The only operating system that doesn't allow a clever
> programmer to crash is one that doesn't do anything.  Microkernels, they
> say, allow you to do nifty things like replace the file system without
> rebooting.  So that means you could swap in a buggy filesystem and destroy
> the data on your disc/flash.  Without rebooting.  Which is good since you
> won't be able to boot from your corrupted filesystem, which won't show up
> until the next power failure, while the poor nurse with a flashlight talks
> to a guy on the phone who assures her QNX can't fail.  So every OS, and
> every feature, has its pro's and con's.  The question for any CSA is not 'is
> this reliable' but 'can I make a reliable system using this component'?
> Will the OS eat itself, or do I only have to worry about the mistakes I
> make?  A carefully constructed linux system should be good for 5 or even 6
> nines of reliability.

The issue is more the presentation of convincing
safety cases, and in that area QNX most likely has an easier time than
embedded Linux: not because Linux has less potential, but because
it does not have the track record for safety-critical apps (yet).
And with the development speed of the Linux kernel, a real
evaluation of the kernel is a non-trivial task. In that respect a
microkernel does have serious advantages if one can isolate components,
that is, guarantee error containment within a component, in a way that
composability is maintained even in the error case. This will definitely
be hard to do for Linux, and most likely for QNX core components a lot of
this work has already been done.

To summarize the problem, a quote from Rich Cook:

Programming today is a race between software engineers striving to
build bigger and better idiot-proof programs, and the universe
striving to produce bigger and better idiots. So far, the universe is
winning.

hofrat


* Re: Linux is not reliable enough?
  2004-07-26 13:46       ` Mark Chambers
  2004-07-26 14:31         ` Der Herr Hofrat
@ 2004-07-26 15:42         ` Marius Groeger
  2004-07-27 11:20         ` Robert Kaiser
  2 siblings, 0 replies; 17+ messages in thread
From: Marius Groeger @ 2004-07-26 15:42 UTC (permalink / raw)
  To: Mark Chambers; +Cc: linuxppc-embedded

On Mon, 26 Jul 2004, Mark Chambers wrote:

> to a guy on the phone who assures her QNX can't fail.  So every OS, and
> every feature, has its pro's and con's.  The question for any CSA is not 'is
> this reliable' but 'can I make a reliable system using this component'?

I agree: reliability is very strongly connected to the actual
components being used, and the overall system design.

One point in favor of microkernel approaches, and one that the CSA may have
been after: they allow you to decouple things. People advocating Linux
as a "solves-everything" sometimes fail to see that there are numerous
applications which want to embrace the typical strengths of Linux
(that is, the general purpose OS with a GUI, networking, POSIX shell)
for a nice front-end, or for non-critical functions. They still
*need* to have the critical stuff running in a certified environment.

In other words: without serious work (read: big $$$), and most likely
many, many modifications and limitations that take away almost all the
dynamism that open source software is known and loved for, we're not
going to see Linux in applications which require DO-178B Level A
certification.

Having said that, we may be far beyond that CSA's intentions here. But
I think this is an interesting (albeit OT) discussion, regardless.

Regards,
Marius

--
Marius Groeger <mgroeger@sysgo.com>
SYSGO AG                      Embedded and Real-Time Software
Voice: +49 6136 9948 0                  FAX: +49 6136 9948 10
www.sysgo.com | www.elinos.com | www.osek.de | www.imerva.com


* Re: Linux is not reliable enough?
  2004-07-26 13:46       ` Mark Chambers
  2004-07-26 14:31         ` Der Herr Hofrat
  2004-07-26 15:42         ` Marius Groeger
@ 2004-07-27 11:20         ` Robert Kaiser
  2004-07-27 13:29           ` Mark Chambers
  2 siblings, 1 reply; 17+ messages in thread
From: Robert Kaiser @ 2004-07-27 11:20 UTC (permalink / raw)
  To: Mark Chambers, Marius Groeger; +Cc: linuxppc-embedded

On Monday, 26 July 2004 15:46, Mark Chambers wrote:
> But if you write drivers that don't crash the system then
> linux is not unreliable.

It's not just the drivers: every line of kernel code (and there are over a
million of those in linux) has the potential to crash the system. In order to
be *really* sure that the system is reliable, one would have to give all that
code a thorough examination. Depending on how "thorough examination" is
defined (and there are approved standards for this), this effort results in
costs that quickly make the question of whether the OS's source code is
available for free, or will cost a few hundred kilobucks, a non-issue.

> The only operating system that doesn't allow a
> clever programmer to crash is one that doesn't do anything.  Microkernels,
> they say, allow you to do nifty things like replace the file system without
> rebooting.

This is not really a microkernel-specific feature. I believe Linux with its
kernel modules can do this as well.

The important thing about the microkernel approach is that it allows OS
functionality to be built from components, where each component runs in its
own address space and only has access to the resources it needs to do its
job. A device driver only needs access to the registers of the device it is
supposed to handle, so it can only foul up this particular device (*). If
such a driver goes haywire, it *can* not affect, e.g., other drivers'
hardware or memory: the bug remains local to the software that causes it.
Such a failure affects only the offending component itself and the software
modules that rely on the services this component offers.

The benefit of this approach for safety-critical systems is that one can
identify the components that are critical to the application. If a particular
application does not require a great deal of OS functionality, then only the
few components necessary to implement it need to be scrutinized. Other
components may well exist in the system, for example to support non-critical
parts of the application, because they can not affect the critical parts.
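
One way to picture "only the few components necessary" is as the transitive
closure of the critical application's dependencies. The sketch below assumes
a made-up component graph (`DEPENDS_ON`) and a hypothetical helper
`components_to_scrutinize`; none of these names come from a real system.

```python
# Hypothetical component graph: each component lists the services it uses.
DEPENDS_ON = {
    "alarm_app": ["timer", "display_driver"],
    "display_driver": ["timer"],
    "timer": [],
    "web_ui": ["tcp_stack", "filesystem"],
    "tcp_stack": ["ethernet_driver"],
    "ethernet_driver": [],
    "filesystem": ["flash_driver"],
    "flash_driver": [],
}


def components_to_scrutinize(critical, depends_on):
    """Return the critical component plus everything it transitively relies on."""
    needed = set()
    stack = [critical]
    while stack:
        name = stack.pop()
        if name not in needed:
            needed.add(name)
            stack.extend(depends_on[name])
    return needed


if __name__ == "__main__":
    # Prints ['alarm_app', 'display_driver', 'timer']
    print(sorted(components_to_scrutinize("alarm_app", DEPENDS_ON)))
```

In this made-up graph, the web UI, TCP stack and filesystem fall outside the
set to be examined. That conclusion only holds if the isolation between
components is actually enforced, which is the caveat in the footnote (*).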

Don't get me wrong: I'm not saying a microkernel (or even QNX) is inherently
safer than Linux. However, if done right, it can give you the freedom to
trade functional complexity against functional safety.

> So that means you could swap in a buggy filesystem and destroy
> the data on your disc/flash.  Without rebooting.  Which is good since you
> won't be able to boot from your corrupted filesystem, which won't show up
> until the next power failure, while the poor nurse with a flashlight talks
> to a guy on the phone who assures her QNX can't fail.  So every OS, and
> every feature, has its pro's and con's.  The question for any CSA is not
> 'is this reliable' but 'can I make a reliable system using this component'?
> Will the OS eat itself, or do I only have to worry about the mistakes I
> make?  A carefully constructed linux system should be good for 5 or even 6
> nines of reliability.

This may be your gut feeling, but the CSA has to *prove* it for the OS he
chooses (at least he should have to, that is his responsibility).

(Honestly: would you fly in an aircraft whose steer-by-wire system is
controlled by Linux/QNX/any other OS (name please)?)

Rob

(*) There are some more issues here which I left out for brevity: If the
device being handled by a driver is capable of DMA, it *can* crash
everything. Therefore, such drivers need special consideration. Also, for
memory-mapped I/O, access permissions to device registers need to be enforced
by the MMU, so if multiple devices have their registers within the same
physical page, they cannot be protected from each other. Nevertheless, the
likelihood of a wild pointer causing spurious crashes is still greatly
reduced.
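
The page-granularity point can be checked with simple arithmetic. The sketch
below assumes a 4 KiB MMU page size and uses invented register base
addresses; `DEVICE_REGS` and `devices_sharing_pages` are hypothetical names.
It only looks at each base address, so it also assumes that a register block
does not straddle a page boundary.

```python
from collections import defaultdict

PAGE_SIZE = 4096  # assumed MMU page size in bytes

# Hypothetical physical base addresses of device register blocks.
DEVICE_REGS = {
    "uart0": 0xF0000100,
    "uart1": 0xF0000200,
    "ethernet": 0xF0003000,
}


def devices_sharing_pages(device_regs, page_size=PAGE_SIZE):
    """Group devices by physical page; keep only pages that hold more than one."""
    by_page = defaultdict(list)
    for name, addr in device_regs.items():
        by_page[addr // page_size].append(name)
    return {page: sorted(names) for page, names in by_page.items() if len(names) > 1}


if __name__ == "__main__":
    for page, names in devices_sharing_pages(DEVICE_REGS).items():
        print(f"page 0x{page:X}: {', '.join(names)} cannot be isolated from each other")
```

With these example addresses, `uart0` and `uart1` land in the same page, so a
driver mapped to one of them can also reach the other's registers.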

----------------------------------------------------------------
Robert Kaiser                         email: rkaiser@sysgo.com
SYSGO AG
Am Pfaffenstein 14                    phone: (49) 6136 9948-762
D-55270 Klein-Winternheim / Germany   fax:   (49) 6136 9948-10

** Sent via the linuxppc-embedded mail list. See http://lists.linuxppc.org/

* Re: Linux is not reliable enough?
  2004-07-27 11:20         ` Robert Kaiser
@ 2004-07-27 13:29           ` Mark Chambers
  0 siblings, 0 replies; 17+ messages in thread
From: Mark Chambers @ 2004-07-27 13:29 UTC (permalink / raw)
  To: linuxppc-embedded

Ok, I promise (mostly to myself) that this is my last comment about this.
One point I was trying to make is that assuming the underlying hardware is
good, all software is theoretically perfect.  That is, given the same set of
input conditions it will always produce the same output.  So software can
only become unreliable when applied to some real world application, when the
deterministic outputs are not the ones we wanted.  So perhaps I'm being
overly pedantic here, but I think it's relevant to our discussion, because
people may make an apples to oranges comparison when comparing linux to
other OS's.  Other OS's may provide a shrink-wrapped solution which has been
extensively tested and is relatively guaranteed for some range of inputs.
Linux, on the other hand, is more a raw material.  But because linux is open
source it can be molded to your application.  The many flavors of hard
realtime or 'carrier grade' linux are perfect examples of this.  So it comes
down to evaluating what you need to do your job, and not applying some
mystical quality of 'reliability' to a piece of software.  I do understand
what we mean by 'reliable' as a practical matter; my point is just that you
can't say whether Linux is reliable or not until you know what it's required
to do.  If I wanted to create a digital alarm clock, for an extreme example,
I bet I could write a Linux app that would NEVER crash (at least until
Y4K).

By the way, it's not my 'feeling' that linux can do 5 nines.  It's been
done.  Granted, the numbers come from clusters of PCs, so 100 PCs running
without failure for a year does not necessarily translate to one PC running
for 100 years.
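
To put rough numbers on this, here is a minimal sketch. It reads "nines" as
availability (an assumption, since the thread says "reliability") and uses a
constant-failure-rate (Poisson) model to bound the failure rate after a
failure-free observation period. The function names are hypothetical.

```python
import math

MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600; ignores leap years


def downtime_minutes_per_year(nines):
    """Allowed downtime per year for an availability of `nines` nines."""
    unavailability = 10.0 ** -nines
    return unavailability * MINUTES_PER_YEAR


def failure_rate_upper_bound(machine_years, confidence=0.95):
    """Upper bound on failures per machine-year after zero observed failures.

    Assumes a constant failure rate (Poisson model), which is what lets
    machine-years from many PCs be pooled in the first place.
    """
    return -math.log(1.0 - confidence) / machine_years


if __name__ == "__main__":
    print(f"5 nines: {downtime_minutes_per_year(5):.3f} min/year")
    print(f"6 nines: {downtime_minutes_per_year(6) * 60:.1f} s/year")
    print(f"100 failure-free machine-years: "
          f"<= {failure_rate_upper_bound(100):.3f} failures/year (95%)")
```

Five nines allows about 5.3 minutes of downtime per year, six nines about 32
seconds. Under the constant-rate assumption, 100 failure-free machine-years
bound the rate at roughly 0.03 failures per machine-year with 95% confidence.
That assumption is exactly what breaks down when extrapolating to a single
PC running for 100 years, since it ignores wear-out.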

And finally, this is all quite relevant to the discussion about dropping 2.4
for 2.6.  It's important for linux in general that well characterized
versions of the software are available.

So as Forrest Gump would say, that's all I have to say about that.

Mark Chambers

* Re: Linux is not reliable enough?
  2004-07-23 21:06 ` Linux is not reliable enough? Kevin P. Dankwardt
  2004-07-24  3:02   ` Linh Dang
  2004-07-24 11:35   ` Mark Chambers
@ 2004-07-24 21:44   ` Sylvain Munaut
  2 siblings, 0 replies; 17+ messages in thread
From: Sylvain Munaut @ 2004-07-24 21:44 UTC (permalink / raw)
  To: Kevin P. Dankwardt, Linux/PPC Embedded


Kevin P. Dankwardt wrote:

> What the team was hoping to use is a MPC5200 based system and the
> ELDK.
>
> Does anyone know of any embedded Linux projects where human lives
> really do depend upon Linux to be robust and reliable?

AFAIK, the MPC5200 is not to be used in life-critical systems ...
That's what the datasheet says, anyway.

Sylvain


end of thread, other threads:[~2004-07-27 23:08 UTC | newest]

Thread overview: 17+ messages
2004-07-27 14:41 Linux is not reliable enough? Wells, Charles
2004-07-27 15:20 ` Mark Chambers
  -- strict thread matches above, loose matches on Subject: below --
2004-07-27 15:59 Mészáros Lajos
2004-07-27 17:10 ` Oliver Korpilla
2004-07-27 23:08   ` Conn Clark
2004-07-23 17:22 random ramblings on 8xx patches (long and tedious :-) Wolfgang Denk
2004-07-23 21:06 ` Linux is not reliable enough? Kevin P. Dankwardt
2004-07-24  3:02   ` Linh Dang
2004-07-24  6:29     ` Der Herr Hofrat
2004-07-25 16:23     ` Wolfgang Denk
2004-07-24 11:35   ` Mark Chambers
2004-07-26  7:49     ` Marius Groeger
2004-07-26 13:46       ` Mark Chambers
2004-07-26 14:31         ` Der Herr Hofrat
2004-07-26 15:42         ` Marius Groeger
2004-07-27 11:20         ` Robert Kaiser
2004-07-27 13:29           ` Mark Chambers
2004-07-24 21:44   ` Sylvain Munaut
