public inbox for linux-kernel@vger.kernel.org
* Re: Distributed Linux
@ 2002-11-13  5:13 Aneesh Kumar K.V
  2002-11-13 17:47 ` [SSI] " Bruce Walker
  0 siblings, 1 reply; 5+ messages in thread
From: Aneesh Kumar K.V @ 2002-11-13  5:13 UTC (permalink / raw)
  To: prasad_s; +Cc: linux-kernel, ssic-linux-devel

> As a graduation project i intended to make linux distributed 

This is exactly what the openSSI project does: http://ssic-linux.sf.net

>The processes would be dynamically migrated from one node to the other
>based on the selections of local process (candidate) and the remote
>node. 

In SSI, the process to be migrated is selected using the Mosix
algorithm. If the Mosix load balancer is not enabled, automatic load
balancing doesn't work. But you can use the migrate() call with the
"best node" argument, so that the average load on the machine is used
to determine which node the process should migrate to.
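
As a rough illustration of the "best node" idea (this is not the openSSI
implementation; the helper name pick_best_node and the load figures are
made up for the sketch), choosing a target by average load might look
like this in Python:

```python
from typing import Mapping


def pick_best_node(load_avg: Mapping[int, float], current: int) -> int:
    """Return the node with the lowest average load.

    Hypothetical helper: stays on `current` unless another node is
    strictly less loaded, so a tie never triggers a migration.
    """
    best = current
    for node, load in sorted(load_avg.items()):
        if load < load_avg[best]:
            best = node
    return best


# Made-up load averages for a three-node cluster.
loads = {1: 2.50, 2: 0.40, 3: 1.10}
target = pick_best_node(loads, current=1)
print(f"migrate to node {target}")
```

A real load balancer would presumably also weigh migration cost, which
this sketch ignores.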

>The entire task along with its memory map will be migrated on to the
>other system

SSI even supports mmap across the cluster. That means you can even ask
a process that has mmap'ed a file to migrate to another node.

>The guest system (where the process originated) would
>however have a pseudo process running on it, which would not take much
>resources but would help in handling various signals/

SSI supports cluster-wide signaling. That means you can send a signal to
a process running on another node (you have cluster-wide PIDs).

It also supports cluster-wide message queues, DLM, cluster-wide device
access, and cluster-wide IP. The developers are working on cluster-wide
support for semaphores and shared memory.

NOTE: it supports three architectures: x86, IA64, and Alpha.

-aneesh 





* Re: [SSI] Re: Distributed Linux
  2002-11-13  5:13 Distributed Linux Aneesh Kumar K.V
@ 2002-11-13 17:47 ` Bruce Walker
  2002-11-13 19:06   ` Prasad
  0 siblings, 1 reply; 5+ messages in thread
From: Bruce Walker @ 2002-11-13 17:47 UTC (permalink / raw)
  To: Aneesh Kumar K.V; +Cc: prasad_s, linux-kernel, ssic-linux-devel

> > As a graduation project i intended to make linux distributed 

snip
> 
> >The guest system (where the process originated) would
> >however have a pseudo process running on it, which would not take much
> >resources but would help in handling various signals/
> 
> SSI support cluster wide signaling. That means you can send signal to a
> process running on other node( you have cluster wide PID )
> 
The openSSI process model is quite different from Bproc or Mosix or
your "guest system" proposal.  In the openSSI model, there is no
pseudo or shadow process at the process's creation node;  after
a process migrates, all its system calls are executed on the new
node and signalling to the process is done directly to the process on
the new node.  Besides the obvious performance advantages this can
give, it can also provide availability advantages because the
creation node can go down without taking the process down with it.
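
A toy model of that availability point, in Python. Everything here (the
Cluster class and its method names) is invented for illustration and is
not openSSI code; it only shows that when a signal is routed by
cluster-wide PID to wherever the process currently runs, losing the
creation node does not affect delivery.

```python
class Cluster:
    """Toy cluster: maps cluster-wide PIDs to the node running them."""

    def __init__(self, nodes):
        self.up = set(nodes)    # nodes that are currently alive
        self.location = {}      # pid -> node where the process runs now
        self.delivered = []     # log of (pid, node, signal)

    def spawn(self, pid, node):
        self.location[pid] = node

    def migrate(self, pid, new_node):
        # No shadow process is left behind on the old node.
        self.location[pid] = new_node

    def node_down(self, node):
        self.up.discard(node)
        # Only processes actually running on the failed node are lost.
        self.location = {p: n for p, n in self.location.items() if n != node}

    def kill(self, pid, signal):
        """Deliver directly to the node where the process runs now."""
        node = self.location.get(pid)
        if node is None or node not in self.up:
            return False
        self.delivered.append((pid, node, signal))
        return True


cluster = Cluster(nodes=[1, 2])
cluster.spawn(pid=100, node=1)      # created on node 1
cluster.migrate(100, new_node=2)    # now runs entirely on node 2
cluster.node_down(1)                # creation node fails
print(cluster.kill(100, "SIGTERM"))
```

In a shadow-process design the same kill would have to pass through
node 1 and would fail once that node is gone.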

bruce





* Re: [SSI] Re: Distributed Linux
  2002-11-13 17:47 ` [SSI] " Bruce Walker
@ 2002-11-13 19:06   ` Prasad
  2002-11-13 19:14     ` Prasad
  2002-11-13 22:58     ` Brian J. Watson
  0 siblings, 2 replies; 5+ messages in thread
From: Prasad @ 2002-11-13 19:06 UTC (permalink / raw)
  To: Bruce Walker; +Cc: Aneesh Kumar K.V, linux-kernel, ssic-linux-devel


On Wed, 13 Nov 2002, Bruce Walker wrote:

> > > As a graduation project i intended to make linux distributed 
> 
> snip
> > 
> > >The guest system (where the process originated) would
> > >however have a pseudo process running on it, which would not take much
> > >resources but would help in handling various signals/
> > 
> > SSI support cluster wide signaling. That means you can send signal to a
> > process running on other node( you have cluster wide PID )
> > 
> The openSSI process model is quite different than Bproc or Mosix or
> your "guest system" proposal.  In the openSSI model, there is no
> pseudo or shadow process at the processes creation node;  after
> a processes migrates, all its system calls are executed on the new
> node and signalling to the process is done directly to the process on
> the new node.  Besides the obvious performance advantages this can
> give, it can also provide availability advantages because the 
> creation node can go down without taking the process down with it.
> 

Yeah, the openSSI approach has some advantages, but what about the other
side: how are devices and files handled?  Isn't it wrong to run
someone else's process when the data that he is supposed to provide is
missing?  My work is based on a workstation model where all the nodes are
independent workstations (in most cases with similar configurations, as in
a computer laboratory at a university).  One of my major constraints is
that the system should be binary compatible with a kernel that does not
support my model. In my case, I plan on packing and restarting a process
when the creation node goes down.

Prasad.

> bruce
> 
> > -aneesh 
> > 

-- 
Failure is not an option



* Re: [SSI] Re: Distributed Linux
  2002-11-13 19:06   ` Prasad
@ 2002-11-13 19:14     ` Prasad
  2002-11-13 22:58     ` Brian J. Watson
  1 sibling, 0 replies; 5+ messages in thread
From: Prasad @ 2002-11-13 19:14 UTC (permalink / raw)
  To: Bruce Walker; +Cc: Aneesh Kumar K.V, linux-kernel, ssic-linux-devel


> 
> Yeah, openSSI approach has some advantages, but how about the other side,
> how are the devices and files being handled?  isn't it wrong to run
> someone elses process when the data that he is supposed to provide is
> missing?  My work is based on a workstation model where all the nodes are
> independent workstations (in most cases with similar configurations, as in
> a computer laboratory at a university).  One of my major constraints is
> that the system should be binary compatible with the kernel that does not
> support my model. In my case i plan packing and restarting a process when
> the creation node goes down.
> 
> Prasad.
> 

I missed something in my previous message: even in my model, only part
of the system-mode computation is done on the creation node, namely the
device/filesystem handling syscalls.  Most other things, such as process
and memory management, are executed on the host system itself.
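
A minimal sketch of that split, in Python. The message names the
categories but not the individual syscalls, so the FORWARDED set below
is an assumption made purely for illustration, as is the function name.

```python
# Assumed split, for illustration only: device/filesystem syscalls go
# back to the creation node, everything else runs on the host node.
FORWARDED = {"open", "read", "write", "ioctl", "stat"}


def where_to_run(syscall: str) -> str:
    """Return which node services a syscall in this toy model."""
    return "creation node" if syscall in FORWARDED else "host node"


for name in ("read", "fork", "brk", "ioctl"):
    print(f"{name:6s} -> {where_to_run(name)}")
```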

Prasad.

-- 
Failure is not an option



* Re: [SSI] Re: Distributed Linux
  2002-11-13 19:06   ` Prasad
  2002-11-13 19:14     ` Prasad
@ 2002-11-13 22:58     ` Brian J. Watson
  1 sibling, 0 replies; 5+ messages in thread
From: Brian J. Watson @ 2002-11-13 22:58 UTC (permalink / raw)
  To: Prasad; +Cc: Bruce Walker, Aneesh Kumar K.V, linux-kernel, ssic-linux-devel

> Yeah, openSSI approach has some advantages, but how about the other side,
> how are the devices and files being handled?

The file systems are shared across the cluster. A mount done on one node
is done on all nodes, and every node has coherent read/write access to
that file system. This can be done in one of three ways: CFS, GFS, and
Lustre. CFS is a stateful NFS with tight coherency guarantees that
allows the internal disk of one node to be shared with all nodes. GFS is
a parallel physical file system that allows virtually simultaneous
access to a shared disk that is connected to all nodes. I don't know
much about Lustre, so someone else can fill you in on this. Only CFS and
GFS can be used for the root file system.

Devices are handled by function shipping the file ops. When a process
migrates onto a new node, it "reopens" all of its file descriptors. For
regular files, it essentially opens the files again on the new node
(leveraging the shared file systems described above). For all other
files (devices, sockets, pipes, etc.), it sets up a dummy file structure
with special ops that function ship reads, writes, ioctls, polls, etc.
to the node where a particular object lives.
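
To make the "reopen" step concrete, here is a small Python model of it.
None of this is OpenSSI code: Node, ShippedFile, and reopen are
hypothetical names, and a method call stands in for the message that
would be sent between nodes. It only shows the shape of the idea:
regular files are opened again locally, while anything else gets a
dummy file whose operations are forwarded to the node where the object
lives.

```python
class Node:
    """Toy node that owns some non-regular objects (devices, pipes, ...)."""

    def __init__(self, name):
        self.name = name
        self.objects = {}   # object id -> bytearray standing in for the object

    def handle(self, op, obj_id, *args):
        # Runs on the node where the object lives.
        buf = self.objects[obj_id]
        if op == "write":
            buf.extend(args[0])
            return len(args[0])
        if op == "read":
            data = bytes(buf[:args[0]])
            del buf[:args[0]]
            return data
        raise ValueError(f"unsupported op: {op}")


class ShippedFile:
    """Dummy file whose ops are forwarded to the object's home node."""

    def __init__(self, home, obj_id):
        self.home, self.obj_id = home, obj_id

    def write(self, data):
        return self.home.handle("write", self.obj_id, data)

    def read(self, n):
        return self.home.handle("read", self.obj_id, n)


def reopen(fd_table, regular_open):
    """Rebuild a file table after migration.

    fd_table maps fd -> ("regular", path) or ("object", home_node, obj_id).
    regular_open is whatever opens a path on the new node.
    """
    new_table = {}
    for fd, entry in fd_table.items():
        if entry[0] == "regular":
            new_table[fd] = regular_open(entry[1])
        else:
            new_table[fd] = ShippedFile(entry[1], entry[2])
    return new_table


home = Node("node1")
home.objects["pipe0"] = bytearray()
files = reopen({3: ("object", home, "pipe0")}, regular_open=lambda path: path)
files[3].write(b"hello")     # executed on node1, not on the new node
print(files[3].read(5))
```

The migrated process keeps using fd 3 as before; only the ops behind it
have changed.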

> isn't it wrong to run
> someone elses process when the data that he is supposed to provide is
> missing?

As I described above, the data is available anywhere in the cluster.

> One of my major constraints is
> that the system should be binary compatible with the kernel that does not
> support my model.

That's a constraint of our clustering technology, as well. Our stuff is
installed by replacing the kernel and a few key commands that have been
made cluster aware: init, mkinitrd, lilo, mount, swapon, fsck, and maybe
one or two others I can't remember. Everything else in the OS is
blissfully unaware of the modified kernel underneath. A process running
an unmodified program can be migrated around the cluster without any
problems (apart from potential performance issues if it's doing a lot of
work with remote objects).

-- 
Brian Watson                | "Now I don't know, but I been told it's
Software Developer          |  hard to run with the weight of gold,
Open SSI Clustering Project |  Other hand I heard it said, it's
Hewlett-Packard Company     |  just as hard with the weight of lead."
                            |     -Robert Hunter, 1970

mailto:Brian.J.Watson@hp.com
http://opensource.compaq.com/

