From mboxrd@z Thu Jan  1 00:00:00 1970
From: Paul Jackson <pj@sgi.com>
Date: Tue, 29 Jun 2004 08:37:52 +0000
Subject: Re: How to notify app of changed cpu/mem/io node configuration?
Message-Id: <20040629013752.3b519b9d.pj@sgi.com>
List-Id: <linux-hotplug.vger.kernel.org>
References: <20040628173808.04718b83.pj@sgi.com>
In-Reply-To: <20040628173808.04718b83.pj@sgi.com>
MIME-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
To: linux-hotplug@vger.kernel.org

Greg KH wrote:
> Through /sbin/hotplug.  Is that not acceptable?

Hotplug is a fine tool.  Not what I had in mind however.

I'm afraid my initial question was quite incomplete.

And now that I see your initial response, perhaps off-topic for this
list as well - though not that far off.

My question is not how to translate kernel hardware config events into
executing selected user code (which is what I understand hotplug is good
at), but rather how to cause a long running big scientific or technical
job (think a 4 day Fortran program) to execute a bit of code that calls
mbind(2) and/or set_mempolicy(2), at the request of a user level system
service (think batch manager).

Consider the following scenario:

  1) A very large system - hundreds of CPUs.
  2) Running a handful of long running jobs, perhaps days.
  3) Each job is carefully placed, to use certain CPUs and Memory.
  4) Shaving a few percent off the total runtime of a job can be critical.
  5) A batch manager controls the start, stop and placement of jobs.
  6) The batch manager learns, perhaps via hotplug, that a node is going down.
  7) The batch manager now wants to move the job that was using that node.

We have enough in current 2.6 kernels to move the CPU placement, as
sched_setaffinity(2) can be applied to another task's pid.  However the
mbind(2) and set_mempolicy(2) calls only apply to the current task, for
good reason - too hard otherwise.  So the batch manager needs to poke a
seldom used bit of co-operating code buried in the job to be moved, to
tell it to issue the requisite mbind and set_mempolicy calls to move on
over to the new CPU and Memory resources.
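
To make the CPU half of that concrete, here is a rough sketch of what
the batch manager side might look like.  The helper name
move_task_cpus() and its argument layout are made up for illustration;
only sched_setaffinity(2) and the CPU_* macros are real interfaces.  It
moves CPU placement only and does nothing for memory, which is exactly
the gap described above.

```c
#define _GNU_SOURCE           /* for sched_setaffinity() and the CPU_* macros */
#include <assert.h>
#include <errno.h>
#include <sched.h>
#include <stddef.h>
#include <sys/types.h>

/*
 * Hypothetical batch-manager helper: restrict task `pid` to the CPUs
 * listed in cpus[0..n-1].  Returns 0 on success, -1 with errno set.
 *
 * This changes CPU placement only.  The target's memory policy is left
 * alone, because mbind(2) and set_mempolicy(2) act on the calling task.
 * A multi-threaded job would need one call per thread id.
 */
int move_task_cpus(pid_t pid, const int *cpus, size_t n)
{
    cpu_set_t set;

    assert(cpus != NULL || n == 0);
    if (n == 0) {
        errno = EINVAL;           /* an empty CPU list is never valid */
        return -1;
    }

    CPU_ZERO(&set);
    for (size_t i = 0; i < n; i++) {
        if (cpus[i] < 0 || cpus[i] >= CPU_SETSIZE) {
            errno = EINVAL;       /* CPU number does not fit in cpu_set_t */
            return -1;
        }
        CPU_SET(cpus[i], &set);
    }

    /* pid 0 would mean "the caller"; any other pid targets that task. */
    return sched_setaffinity(pid, sizeof(set), &set);
}
```

The caller needs permission to change the target task's affinity, as
described in the sched_setaffinity(2) man page.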

=> The critical step that I am pondering here is:

  How does this event first enter the process space of the long running job?

Certainly, the job isn't hanging out on some D-BUS just waiting for it. 
It is too busy computing some large problem.

So, I need a low bandwidth, uni-directional, asynchronous channel that
has essentially zero cost on the receiver, until such time as it is
used.

I suspect that I am describing signals and only signals.
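
For the sake of discussion, a minimal sketch of the receiving side,
assuming a single-threaded job, SIGUSR1 as the agreed signal, and the
target node number passed as the sigqueue(3) payload.  All of the
function names are hypothetical, and a real job would likely need a
richer request than one node number.  The point is the cost profile: the
handler only records the request, and the compute loop pays a single
load per check until a request actually arrives.

```c
#define _GNU_SOURCE           /* for syscall() and sigqueue() */
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <stddef.h>
#include <string.h>
#include <sys/syscall.h>
#include <unistd.h>
#include <linux/mempolicy.h>  /* MPOL_BIND */

/* -1 means no request pending; otherwise the requested node number. */
static volatile sig_atomic_t pending_node = -1;

/* Do the minimum in the handler: record the payload and return. */
static void on_placement_signal(int sig, siginfo_t *info, void *ctx)
{
    (void)sig;
    (void)ctx;
    /* Accept only sigqueue() senders, which carry a real payload. */
    if (info->si_code == SI_QUEUE && info->si_value.sival_int >= 0)
        pending_node = info->si_value.sival_int;
}

/* Call once at job startup.  Returns 0 on success, -1 on error. */
int install_placement_handler(void)
{
    struct sigaction sa;

    memset(&sa, 0, sizeof(sa));
    sa.sa_sigaction = on_placement_signal;
    sa.sa_flags = SA_SIGINFO | SA_RESTART;
    sigemptyset(&sa.sa_mask);
    return sigaction(SIGUSR1, &sa, NULL);
}

/*
 * Call from a safe point in the compute loop.  Returns 1 and stores the
 * node in *node_out if a request arrived, else 0.
 */
int take_placement_request(int *node_out)
{
    sigset_t block, old;

    assert(node_out != NULL);
    if (pending_node < 0)
        return 0;                 /* fast path: one load, no syscall */

    /* Block the signal so a second request cannot race with the reset. */
    sigemptyset(&block);
    sigaddset(&block, SIGUSR1);
    sigprocmask(SIG_BLOCK, &block, &old);
    *node_out = (int)pending_node;
    pending_node = -1;
    sigprocmask(SIG_SETMASK, &old, NULL);
    return 1;
}

/*
 * Bind the calling task's future allocations to one node.  Returns 0 on
 * success, -1 with errno set.  Pages already allocated stay where they
 * are.  The node is kept below the mask width to stay clear of the
 * kernel's maxnode handling.
 */
int bind_memory_to_node(int node)
{
    unsigned long mask;
    const int bits = (int)(8 * sizeof(mask));

    if (node < 0 || node >= bits - 1) {
        errno = EINVAL;
        return -1;
    }
    mask = 1UL << node;
    return (int)syscall(SYS_set_mempolicy, MPOL_BIND, &mask,
                        (unsigned long)bits);
}
```

The job would call install_placement_handler() once, then at some
convenient point in its outer loop do something like
"if (take_placement_request(&node)) bind_memory_to_node(node);".  The
batch manager would send the request with sigqueue(pid, SIGUSR1, value),
with value.sival_int set to the node.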

The only other option I'm aware of would be adding kernel code that,
similar in implementation to signals and debugging (ptrace) events,
provides hooks to enable one task to cause another to invoke from within
the kernel the mbind and set_mempolicy system calls on itself.

I suspect that this alternative (doing it in the kernel) would go over
on lkml like a lead balloon.  As well it should.

> I'm sure there will be a DBUS event generated from it :)

These messages make sense between co-operating threads that are waiting
for messages.  But I would guess (correct me if I'm wrong) they don't
provide async notification to unsuspecting tasks that were 100% busy
doing something entirely unrelated.  D-Bus sounds to me like another RPC
or CORBA-style IPC mechanism, tailored for sensible use between desktop
apps and services.

-- 
                          I won't rest till it's the best ...
                          Programmer, Linux Scalability
                          Paul Jackson <pj@sgi.com> 1.650.933.1373

