From mboxrd@z Thu Jan 1 00:00:00 1970
From: Paul Jackson
Date: Tue, 29 Jun 2004 08:37:52 +0000
Subject: Re: How to notify app of changed cpu/mem/io node configuration?
Message-Id: <20040629013752.3b519b9d.pj@sgi.com>
List-Id:
References: <20040628173808.04718b83.pj@sgi.com>
In-Reply-To: <20040628173808.04718b83.pj@sgi.com>
MIME-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
To: linux-hotplug@vger.kernel.org

Greg KH wrote:
> Through /sbin/hotplug. Is that not acceptable?

Hotplug is a fine tool. Not what I had in mind however.

I'm afraid my initial question was quite incomplete. And now that I
see your initial response, perhaps off-topic for this list as well -
though not that far off.

My question is not how to translate kernel hardware config events into
executing selected user code (which is what I understand hotplug is
good at), but rather how to cause a long running big scientific or
technical job (think a 4 day Fortran program) to execute a bit of code
that calls mbind(2) and/or set_mempolicy(2), at the request of a user
level system service (think batch manager).

Consider the following scenario:

 1) A very large system - hundreds of CPUs.
 2) Running a handful of long running jobs, perhaps days.
 3) Each job is carefully placed, to use certain CPUs and Memory.
 4) Shaving a few percent off the total runtime of a job can be
    critical.
 5) A batch manager controls the start, stop and placement of jobs.
 6) The batch manager learns, perhaps via hotplug, that a node is
    going down.
 7) The batch manager now wants to move the job that was using that
    node.

We have enough in current 2.6 kernels to move the CPU placement, as
sched_setaffinity(2) can be applied to other tasks' pids. However the
mbind(2) and set_mempolicy(2) calls only apply to the current task,
for good reason - too hard otherwise.
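
The CPU half of such a move, which the message says is already possible
from outside the job, might look roughly like the sketch below. It
assumes a Linux host and uses Python's os.sched_setaffinity wrapper
around sched_setaffinity(2); the helper name move_job_cpus and the
choice of target CPUs are hypothetical, made up for illustration.

```python
import os


def move_job_cpus(pid, new_cpus):
    """Restrict task `pid` to `new_cpus`; return the mask the kernel reports.

    Hypothetical helper. It wraps sched_setaffinity(2), which accepts the
    pid of another task, unlike mbind(2) and set_mempolicy(2).
    """
    os.sched_setaffinity(pid, new_cpus)
    return os.sched_getaffinity(pid)


if __name__ == "__main__":
    # pid 0 means "the calling task"; a batch manager would pass the
    # pid of the job it wants to move instead.
    allowed = os.sched_getaffinity(0)
    target = {min(allowed)}          # assumption: shrink to a single CPU
    print(move_job_cpus(0, target))
    os.sched_setaffinity(0, allowed)  # restore the original placement
```

Note that the call acts on a single task id, so a multi-threaded job
would need it applied per thread, and the caller needs permission over
the target task. Nothing comparable exists here for the memory side,
which is the gap the rest of the message is about.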

So the batch manager needs to poke a seldom used bit of co-operating
code buried in the job to be moved, to tell it to issue the requisite
mbind and set_mempolicy calls to move on over to the new CPU and
Memory resources.

 => The critical step that I am pondering here is:

	How does this event first enter the process space
	of the long running job?

Certainly, the job isn't hanging out on some D-BUS just waiting for
it. It is too busy computing some large problem.

So, I need a low bandwidth, uni-directional, asynchronous channel that
has essentially zero cost on the receiver, until such time as it is
used.

I suspect that I am describing signals and only signals.

The only other option I'm aware of would be adding kernel code that,
similar in implementation to signals and debugging (ptrace) events,
provides hooks to enable one task to cause another to invoke from
within the kernel the mbind and set_mempolicy system calls on itself.

I suspect that this alternative (doing it in the kernel) would go over
on lkml like a lead balloon. As well it should.

> I'm sure there will be a DBUS event generated from it :)

These messages make sense between co-operating threads that are
waiting for messages. But I would guess (correct me if I'm wrong) they
don't provide async notification to unsuspecting tasks that were 100%
busy doing something entirely unrelated.

D-Bus sounds to me like another RPC or Corba IPC mechanism, tailored
for sensible use between desktop apps and services.

-- 
I won't rest till it's the best ...
Programmer, Linux Scalability
Paul Jackson 1.650.933.1373
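
The signal-based channel proposed in the message could be sketched as
follows. This is only an illustration under stated assumptions: SIGUSR1
is an arbitrary choice of signal, the names rebind_requested,
on_rebind_signal, rebind_memory and run_job are hypothetical, and
rebind_memory is a placeholder where a real job would issue its own
mbind(2) and set_mempolicy(2) calls. The os.kill call stands in for the
batch manager, which would normally send the signal from another
process.

```python
import os
import signal

# Hypothetical flag: set by the handler, polled by the compute loop.
rebind_requested = False


def on_rebind_signal(signum, frame):
    """Do the bare minimum in the handler: just record the request."""
    global rebind_requested
    rebind_requested = True


def rebind_memory():
    """Placeholder for the job's own mbind(2)/set_mempolicy(2) calls."""
    return "rebound"


def run_job(steps, notify_at=None):
    """Run `steps` units of work, rebinding whenever a signal has arrived.

    `notify_at` simulates the batch manager: at that step the job
    signals itself. Returns a list of (step, result) rebind events.
    """
    global rebind_requested
    rebind_requested = False
    signal.signal(signal.SIGUSR1, on_rebind_signal)
    events = []
    for step in range(steps):
        if notify_at == step:
            os.kill(os.getpid(), signal.SIGUSR1)  # stand-in for the manager
        if rebind_requested:
            rebind_requested = False
            events.append((step, rebind_memory()))
        # ... one unit of the long running computation would go here ...
    return events


if __name__ == "__main__":
    print(run_job(steps=1000, notify_at=3))
```

Until a signal arrives, the only cost to the job is one flag test per
iteration, which matches the "essentially zero cost on the receiver"
requirement. A C or Fortran job would follow the same shape, with the
flag declared as volatile sig_atomic_t on the C side.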