linux-hotplug.vger.kernel.org archive mirror
From: Paul Jackson <pj@sgi.com>
To: linux-hotplug@vger.kernel.org
Subject: Re: How to notify app of changed cpu/mem/io node configuration?
Date: Tue, 29 Jun 2004 08:37:52 +0000
Message-ID: <20040629013752.3b519b9d.pj@sgi.com>
In-Reply-To: <20040628173808.04718b83.pj@sgi.com>

Greg KH wrote:
> Through /sbin/hotplug.  Is that not acceptable?

Hotplug is a fine tool.  Not what I had in mind however.

I'm afraid my initial question was quite incomplete.

And now that I see your initial response, perhaps off-topic for this
list as well - though not that far off.

My question is not how to translate kernel hardware config events into
executing selected user code (which is what I understand hotplug is good
at), but rather how to cause a long running big scientific or technical
job (think a 4 day Fortran program) to execute a bit of code that calls
mbind(2) and/or set_mempolicy(2), at the request of a user level system
service (think batch manager).

Consider the following scenario:

  1) A very large system - hundreds of CPUs.
  2) Running a handful of long running jobs, perhaps days.
  3) Each job is carefully placed, to use certain CPUs and Memory.
  4) Shaving a few percent off the total runtime of a job can be critical.
  5) A batch manager controls the start, stop and placement of jobs.
  6) The batch manager learns, perhaps via hotplug, that a node is going down.
  7) The batch manager now wants to move the job that was using that node.

We have enough in current 2.6 kernels to move the CPU placement, as
sched_setaffinity(2) can be applied to other tasks' pids.  However the
mbind(2) and set_mempolicy(2) calls only apply to the current task, for
good reason - too hard otherwise.  So the batch manager needs to poke a
seldom used bit of co-operating code buried in the job to be moved, to
tell it to issue the requisite mbind and set_mempolicy calls to move on
over to the new CPU and Memory resources.
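
To illustrate the asymmetry described above, here is a minimal sketch in
Python (chosen only for brevity).  The standard library's
os.sched_setaffinity() wraps sched_setaffinity(2) and accepts another
task's pid; there is no comparable call for setting another task's memory
policy.  The helper name move_task_cpus is a hypothetical one made up for
this example, not part of any batch manager.

```python
import os


def move_task_cpus(pid, cpus):
    """Restrict task `pid` to the CPU set `cpus` and return the new mask.

    Hypothetical helper, for illustration only.  Linux-specific; the
    caller needs permission to change the target task's affinity.
    """
    os.sched_setaffinity(pid, cpus)
    return os.sched_getaffinity(pid)
```

A batch manager might call something like move_task_cpus(job_pid, {8, 9,
10, 11}), where job_pid and the CPU numbers are placeholders.  The memory
side still has to be done from inside the job itself.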

=> The critical step that I am pondering here is:

  How does this event first enter the process space of the long running job?

Certainly, the job isn't hanging out on some D-BUS just waiting for it. 
It is too busy computing some large problem.

So, I need a low bandwidth, uni-directional, asynchronous channel that
has essentially zero cost on the receiver, until such time as it is
used.

I suspect that I am describing signals and only signals.
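
A rough sketch of what the signal approach might look like on the
receiving side, again in Python for brevity.  The handler only sets a
flag, so the job pays nothing until a signal arrives; the compute loop
checks the flag at a safe point and then runs the rebinding code.  The
names PlacementNotifier, run_job and rebind are hypothetical, the choice
of SIGUSR1 is an assumption, and in a real job rebind would be the code
that issues mbind(2) and/or set_mempolicy(2), which Python's standard
library does not wrap.

```python
import os
import signal


class PlacementNotifier:
    """Records that a 'please re-place yourself' signal has arrived."""

    def __init__(self, signum=signal.SIGUSR1):  # SIGUSR1 is an assumption
        self.pending = False
        signal.signal(signum, self._on_signal)

    def _on_signal(self, signum, frame):
        # Do as little as possible in the handler: just note the request.
        self.pending = True

    def poll(self, rebind):
        """Run `rebind` once if a request is pending; return True if it ran."""
        if self.pending:
            self.pending = False
            rebind()
            return True
        return False


def run_job(steps, notifier, rebind):
    """Stand-in for the long running computation (hypothetical)."""
    total = 0
    for i in range(steps):
        total += i * i           # the "real work"
        notifier.poll(rebind)    # cheap flag check at a safe point
    return total
```

The sending side would be the batch manager doing the equivalent of
os.kill(job_pid, signal.SIGUSR1), with job_pid a placeholder.  A plain
signal carries no payload, so how the job learns its new placement is left
open here.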

The only other option I'm aware of would be adding kernel code that,
similar in implementation to signals and debugging (ptrace) events,
provides hooks to enable one task to cause another to invoke from within
the kernel the mbind and set_mempolicy system calls on itself.

I suspect that this alternative (doing it in the kernel) would go over
on lkml like a lead balloon.  As well it should.

> I'm sure there will be a DBUS event generated from it :)

These messages make sense between co-operating threads that are waiting
for messages.  But I would guess (correct me if I'm wrong) they don't
provide async notification to unsuspecting tasks that were 100% busy
doing something entirely unrelated.  D-Bus sounds to me like another RPC
or CORBA IPC mechanism, tailored for sensible use between desktop apps
and services.

-- 
                          I won't rest till it's the best ...
                          Programmer, Linux Scalability
                          Paul Jackson <pj@sgi.com> 1.650.933.1373



