public inbox for linux-kernel@vger.kernel.org
* Failover Kernel
@ 2009-02-26  8:58 Tarkan Erimer
  2009-02-26 16:03 ` Willy Tarreau
  2009-02-26 17:02 ` Diego Calleja
  0 siblings, 2 replies; 11+ messages in thread
From: Tarkan Erimer @ 2009-02-26  8:58 UTC
  To: linux-kernel

Hi all,

I've been thinking about a kernel feature called "Failover Kernel". The
basic idea is to keep two kernels in memory (a running "Primary Kernel"
and a standby "Backup Kernel") so that the system can recover from a
kernel panic or crash.

The feature could work like this:

- The "Backup Kernel" could be specified and loaded into memory via a
boot line option such as: "failover_kernel=/boot/vmlinuz-2.6.26"
- The running primary kernel sends keepalives to the "Backup Kernel" to
signal that it is alive.
- The running primary kernel can write a journal (as journaled
filesystems do) with the information the backup kernel needs to recover.
- When the primary kernel crashes and can no longer send keepalives,
the backup kernel recovers from this journal, continues from where the
primary kernel left off, and becomes the primary.
- When the "Backup Kernel" becomes "Primary", it loads the previous one
as the "Backup Kernel" again, or this could be left to manual action:
after the disaster recovery, the user could decide which kernel will be
loaded as backup via a utility like "kexec".
- At kernel compile time, the user can choose the timeout for the
failover kernel. For example, "Recover after 10 ms of inactivity (not
receiving keepalives)."
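
As a rough illustration of the scheme above, the keepalive timeout and
journal replay can be simulated in user space. This is only a sketch
under stated assumptions, not kernel code: the names Journal,
BackupKernel and simulate are hypothetical, the clock is a simulated
counter in milliseconds, and "recovery" is reduced to copying the
journal entries. It says nothing about whether a real kernel could
resume from such a journal.

```python
from dataclasses import dataclass, field


@dataclass
class Journal:
    """Append-only record of state the backup would need (hypothetical)."""
    entries: list = field(default_factory=list)

    def record(self, entry):
        self.entries.append(entry)


@dataclass
class BackupKernel:
    """Watches keepalives from the primary; takes over after a timeout."""
    timeout_ms: int                 # e.g. 10, as in "Recover after 10 ms"
    last_keepalive_ms: int = 0
    is_primary: bool = False
    recovered_state: list = field(default_factory=list)

    def on_keepalive(self, now_ms):
        self.last_keepalive_ms = now_ms

    def check(self, now_ms, journal):
        """Return True if this call triggered a takeover."""
        if self.is_primary:
            return False
        if now_ms - self.last_keepalive_ms > self.timeout_ms:
            # Replay the journal to continue from where the primary stopped.
            self.recovered_state = list(journal.entries)
            self.is_primary = True
            return True
        return False


def simulate(crash_at_ms, end_ms, timeout_ms=10, keepalive_every_ms=2):
    """Step a simulated clock in 1 ms ticks.

    Returns (takeover_time_ms, recovered_state), or (None, []) if the
    backup never had to take over.
    """
    journal = Journal()
    backup = BackupKernel(timeout_ms=timeout_ms)
    for now in range(end_ms + 1):
        primary_alive = now < crash_at_ms
        if primary_alive and now % keepalive_every_ms == 0:
            journal.record(f"state@{now}ms")   # primary journals its state
            backup.on_keepalive(now)           # and signals that it is alive
        if backup.check(now, journal):
            return now, backup.recovered_state
    return None, []
```

For example, simulate(crash_at_ms=5, end_ms=50) sees the last keepalive
at 4 ms, so with the 10 ms timeout the backup takes over at 15 ms with
the three journal entries written before the crash.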


Usage scenarios for this feature could be:

- For people whose datacenter is remote, it is a big problem when you
compile a new kernel and reboot into a crashing or non-booting new
kernel. You are left with a completely crashed, non-functioning system,
and a hard reset and manual action are required. With a "Failover
Kernel" feature, the system would simply switch back to the "Backup
Kernel" (the known stable kernel of the system) and continue to work
without any manual action.

- Your system has run fine for several months, and one day you hit
a bug and the kernel crashes or panics. With "Failover Kernel", the
system would switch to the "Backup Kernel" quickly (maybe within some
milliseconds or a few seconds) to recover, and could then continue to
work normally.

I'm not a coder, and I don't know whether this is technically possible.
Kernel hackers, what's your opinion about it? Is it really possible? If
so, how could we implement it?

Many thanks for reading this long (and maybe stupid) post! :-)

Tarkan ERIMER




Thread overview: 11+ messages
2009-02-26  8:58 Failover Kernel Tarkan Erimer
2009-02-26 16:03 ` Willy Tarreau
2009-02-27 15:25   ` Tarkan Erimer
2009-02-26 17:02 ` Diego Calleja
2009-02-27 15:32   ` Tarkan Erimer
2009-02-27 15:50     ` Lubomir Rintel
2009-03-02 16:21       ` Tarkan Erimer
2009-03-03  3:29         ` David Newall
2009-03-04  8:29           ` Tarkan Erimer
2009-03-06  1:10             ` david
2009-03-09 12:35               ` Tarkan Erimer
